StreamServe Persuasion SP5 Oracle Database


Database Guidelines
Rev A

STREAMSERVE, INC. ALL RIGHTS RESERVED
United States patent #7,127,520

No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of StreamServe, Inc.

Information in this document is subject to change without notice. StreamServe Inc. assumes no responsibility or liability for any errors or inaccuracies that may appear in this book. All registered names, product names and trademarks of other companies mentioned in this documentation are used for identification purposes only and are acknowledged as property of the respective company. Companies, names and data used in examples in this document are fictitious unless otherwise noted.

StreamServe, Inc. offers no guarantees and assumes no responsibility or liability of any type with respect to third party products and services, including any liability resulting from incompatibility between the third party products and services and the products and services offered by StreamServe, Inc. By using StreamServe and the third party products mentioned in this document, you agree that you will not hold StreamServe, Inc. responsible or liable with respect to the third party products and services or seek to do so.

The trademarks, logos, and service marks in this document are the property of StreamServe, Inc. or other third parties. You are not permitted to use the marks without the prior written consent of StreamServe, Inc. or the third party that owns the marks.

Use of the StreamServe product with third party products not mentioned in this document is entirely at your own risk, also as regards the StreamServe products.

StreamServe Web Site

Contents

About StreamServe repositories
  Overview of StreamServe repositories
    StreamServe Enterprise Repository
    Runtime repository
    StreamServe archive
    Web content repository
  Tools for handling StreamServe repositories
Installing StreamServe repositories
  Hardware and software requirements
    Software requirements
    Hardware requirements
    Server configuration
  Installing an Oracle database
    Creating an Oracle database
    Increasing Unicode support
    Changing non-default database parameters
    Disabling changed locking behavior
    Placing data files and redo-log files
    Configuring of Oracle Net
  Creating StreamServe repositories
  Checking StreamServe repositories
  Adjusting StreamServe repositories
    Tables with highest growth rate and number of rows
    Editing the StreamServe tablespaces
    Indexing
      Columns suitable for indexing
      Indexing columns in the runtime repository
      Indexing columns in the StreamServe archive
    Partitioning
      Tables suitable for partitioning
  Uninstalling StreamServe repositories
Maintaining StreamServe repositories
  Top jobs, input jobs and output jobs
  Deleting expired jobs from the runtime repository
    Job deletion process
    Prerequisites and recommendations
    Design Center configurations
    Job deletion schedule
    Scheduling StreamServers to delete jobs
    Scheduling Task Schedulers to delete jobs
    Scheduling the database to delete jobs

  Updating top job statuses in the runtime repository
    Job status update process
    Recommendations
    Scheduling StreamServers to update job statuses
    Scheduling Task Schedulers to update job statuses
    Scheduling the database to update job statuses
  Gathering statistics
    Gathering sample statistics
    Gathering system statistics
  Rebuilding indexes
  Ensuring sufficient free space
  Running maintenance procedures as database jobs
    Creating database jobs using DBMS_SCHEDULER
    Creating database jobs using DBMS_JOB
  Monitoring maintenance sessions and jobs
  Monitoring jobs in the runtime repository
    Queries when monitoring jobs and job statuses
    Queries when monitoring job delete performance
  Performing backup of StreamServe repositories
Appendix A - Time scheduling syntax
Appendix B - SQL EXPLAIN PLAN and datatype RAW

About StreamServe repositories

This document provides an overview of how to install, maintain, and back up the StreamServe repositories running on Oracle Database.

Intended audience
This document is intended for developers, for example StreamServe consultants, who are also familiar with Oracle databases. The document also contains some information that may be of interest for a database administrator, for example information about the tables with the highest growth rate and how StreamServe jobs are deleted from the runtime repository.

Information about third party products
This document does not describe how to carry out configurations in third party products, for example in Oracle SQL*Plus or Oracle SQL Developer. For such information, see the Oracle user documentation.

In this section
- Overview of StreamServe repositories
- Tools for handling StreamServe repositories

Overview of StreamServe repositories

StreamServe uses the following repositories:
- StreamServe Enterprise Repository.
- Runtime repository.
- StreamServe archive (for StreamStudio Collector).
- Web content repository (for StreamStudio Composition Center).

Figure 1 Overview, StreamServe repositories and components

For detailed information about the StreamServe components that interact with the repositories, see the Control Center documentation.

StreamServe Enterprise Repository

A StreamServe Enterprise Repository contains information about computers, StreamServe applications, and StreamServe application domains for a company or organization. You use one central enterprise repository, installed on a specified database server.

If document types are used to categorize documents (for example, as invoices or orders), the enterprise repository is the master storage for these document types. The enterprise repository is also the master storage for the StreamStudio Composition Center template versions.

All communication with the enterprise repository is handled via a management gateway. For example, Control Center communicates with the enterprise repository through a management gateway. Several management gateways can connect to the same enterprise repository.

Figure 2 Communicating with the enterprise repository

Default schema owner and password: The schema owner and password are the same as the database user name, that is, StrsSERAccess.
Default database user name: StrsSERAccess. The management gateway uses this default user when accessing the enterprise repository.

Runtime repository

The runtime repository stores jobs and job related information in queues. The repository also contains security profiles and web access information for the StreamStudio web applications.

Any persistent resources are also stored in the repository, for example document definitions for StreamStudio Composition Center and exception rules to pause service-enabled StreamServe Messages. If you pause Messages, the Messages are stored in a separate queue called the Message storage. If you use the Document Broker Plus solution, the documents are stored in a queue called the Post-processing storage.

Each StreamServe application domain requires a separate runtime repository. For example, one repository can be used for the StreamServe applications in development, and another for the StreamServe applications in production. One runtime repository is shared by all the applications in the application domain.

The runtime repository is accessed by StreamServer, Archiver and Task Scheduler applications. When a StreamStudio web portal accesses the runtime repository, the requests and responses are sent through the service gateway.

Figure 3 Communicating with the runtime repository

Default schema owner and password: dbo
Default repository users:
- strsdataandqueues: The StreamServer user. The user handles all StreamServe jobs and queues.
- strssecurity: The service gateway user. The user handles the security related tables in the runtime repository, and the user authentication.
- strsweb: The StreamStudio user. The users that you create in StreamStudio use this user to connect to the runtime repository.

StreamServe archive

The StreamServe archive stores output documents and related metadata to be accessed from StreamStudio Collector. The StreamServe archive is optimized for searching and querying for documents.

Each application domain can access one single StreamServe archive. However, one StreamServe archive can be shared by several application domains.

An Archiver application transfers documents and related metadata from the runtime repository to the StreamServe archive according to schedules defined in Control Center. When the Collector user searches for documents in the StreamServe archive, all requests and responses are sent through the service gateway.

Figure 4 Communicating with the StreamServe archive

Default schema owner and password: The default repository name, which is the name that the Control Center user assigns when the StreamServe archive is created. See the Control Center documentation.
Default repository user: strsweb. The user that Collector uses to access the StreamServe archive.

Web content repository

The web content repository is used by StreamStudio Composition Center for storing document definitions, resources, and rules during the document design phase. When a document definition is approved, the document definition (together with its resources and rules) is set to published in the runtime repository, and is available to the StreamServer application that produces the document.

The web content repository communicates exclusively with Composition Center.

Figure 5 Communicating with the web content repository

Default schema owner and password: dbo (if you create the application domain using the Application Domain Editor in Control Center) with the same password as the web content profile with the default user strswebcontent.
Default repository user: strswebcontent. The user that Composition Center uses to access the web content repository.

Tools for handling StreamServe repositories

When handling StreamServe repositories, it is recommended to use the tools, applications and scripts provided by StreamServe.

Priority order
Use the priority order below when deciding which tool to use:
1. Handle as much as possible using the configuration tools provided by StreamServe, for example in Design Center, in Control Center or in the StreamStudio web applications.
2. Use StreamServe Database Administration Tool only for tasks which cannot be performed in the configuration tools above. For more information about this tool, see the Database Administration Tool documentation.
3. Use an appropriate Oracle tool, for example Oracle SQL*Plus or Oracle SQL Developer, only for tasks which cannot be performed using the StreamServe tools.

When running external tools, you should primarily use the database scripts provided by StreamServe, for example the StreamServe maintenance scripts referred to in this document.


Installing StreamServe repositories

This chapter contains information on how to install and configure an Oracle database for StreamServe and how to create the StreamServe repositories.

In this section
- Hardware and software requirements
- Installing an Oracle database
- Creating StreamServe repositories
- Checking StreamServe repositories
- Adjusting StreamServe repositories
- Uninstalling StreamServe repositories

Related topics
For information on how to set up StreamServe for Oracle RAC (Real Application Cluster), see the Cluster Guidelines.

Hardware and software requirements

The requirements below apply to the database server where you install the runtime repository and the StreamServe archive.

In this section
- Software requirements
- Hardware requirements
- Server configuration

Software requirements

Oracle Database
It is recommended to use Oracle Database Enterprise Edition for large installations where the StreamServe archive is used. For a complete list of supported versions, see the Supported platforms and software documentation.

Hardware requirements
- 2 CPUs (minimum). The required number of CPUs depends on the data volume and the number of users. If several StreamServe processes run concurrently against the same database, more CPUs are recommended.
- 8 GB RAM (recommended minimum). The required memory depends on the data volume and the number of users. A rule of thumb is 1 GB for the SGA (System Global Area) and 1 GB for the PGA (Process Global Area).
- It is recommended to run 64-bit Oracle.

Server configuration

The computer where you install the database needs lots of memory and numerous fast striped disks. Since StreamServer is an update/insert heavy application, it is recommended to use RAID 1+0 (not RAID-5) for data.

A good I/O strategy is the Oracle recommended S.A.M.E. method (stripe-and-mirror-everything, using a stripe size of 1 MB). For more information, see the Oracle user documentation.

Another strategy, especially if the disks are single disks, is to implement Oracle ASM (Oracle Automatic Storage Management) with failure groups. ASM uses software RAID with a stripe size that depends on the type of Oracle file, and is capable of rebalancing the disks if new disks are added. If you use ASM, you must use RMAN (Oracle Recovery Manager) for database backups. For more information, see the Oracle user documentation.

Single disks are not recommended. S.A.M.E. or ASM is recommended (note that ASM is possible to use with single disks). Single disks without ASM are only recommended for very small sites with few users (by small means less than or equals documents in the database, and less than or equal to 10 concurrent sessions).

If single disks are to be used, the following general guidelines apply:
- Place three control files on physically different disks.
- Place two redo log members per redo log group on different dedicated physical disks.
- Spread the data files and temp files on the disks, except the one used for redo log files.

Installing an Oracle database

A basic Oracle database is sufficient for StreamServe. StreamServe does not depend on any additional database options, such as Java in the database, Oracle Text, Oracle Multimedia, Spatial, etc.

When the Oracle database is installed, you can create the StreamServe repositories. See Creating StreamServe repositories.

In this section
- Creating an Oracle database
- Increasing Unicode support
- Changing non-default database parameters
- Disabling changed locking behavior
- Placing data files and redo-log files
- Configuring of Oracle Net

Creating an Oracle database

Prerequisites
The time zone must be set to UTC both for the database server where you install the runtime repository and for the database server where you install the StreamServe archive.

Before you create the database, you must specify the following non-default database server parameter.

Parameter: db_block_size
Recommended: 8192
Comment: Must be at least If the database server has lots of memory (more than 4 GB), this value can be raised to

To create an Oracle database
1. Create a UTF database with:
   - Character set AL32UTF8.
   - National character set AL16UTF16.

2. Set the database time zone to UTC (that is, GMT). You can do this either by editing the database creation scripts and adding a time zone clause to the CREATE DATABASE command:

     SET TIME_ZONE = 'UTC'

   or by running the command below and restarting the database after creation (but before creating the StreamServe repositories):

     ALTER DATABASE SET TIME_ZONE = 'UTC';
     SHUTDOWN IMMEDIATE
     STARTUP

3. Make all tablespaces locally managed.
4. Make all tablespaces (where applicable) automatic segment space managed (SEGMENT SPACE MANAGEMENT AUTO, which is default on Oracle 11g).
5. If the database character set uses a multi-byte character encoding scheme and the default Unicode support is not enough, you can increase the Unicode support, see Increasing Unicode support below.

Increasing Unicode support

By default StreamServe uses NVARCHAR2, which normally offers enough Unicode support. To achieve a more general Unicode support, you can specify the database server parameter below. The parameter must be specified before the StreamServe repositories are created.

Parameter: nls_length_semantics
Value: char
Comment: StreamServe recommends the default byte value, since the char value requires more space in the database.

For more information about the nls_length_semantics parameter, see the Oracle user documentation, and the Oracle Support document Examples and limits of BYTE and CHAR semantics usage (NLS_LENGTH_SEMANTICS) (Doc ID ).
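The settings from the creation steps above can be checked before the StreamServe repositories are created. The following is a minimal sketch, assuming a SQL*Plus session connected as a DBA user. It uses only standard Oracle dictionary views and is not one of the scripts provided by StreamServe.

```sql
-- Character sets and length semantics chosen for the database
SELECT parameter, value
FROM   nls_database_parameters
WHERE  parameter IN ('NLS_CHARACTERSET',
                     'NLS_NCHAR_CHARACTERSET',
                     'NLS_LENGTH_SEMANTICS');

-- Database time zone; shown either as a region name or as an offset
SELECT dbtimezone FROM dual;

-- Extent and segment space management per tablespace (steps 3 and 4)
SELECT tablespace_name, extent_management, segment_space_management
FROM   dba_tablespaces
ORDER  BY tablespace_name;
```

If the time zone was set with ALTER DATABASE, the new value is only reported after the database has been restarted.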

Changing non-default database parameters

If you are not satisfied with your Oracle installation, you may have to change the following non-default database server parameters. After changing the parameters, you may have to restart the Oracle database.

Parameter name: job_queue_processes
Recommended value: <default>
Comment: The StreamServer application does not use any database jobs. However, the database administrator may need the job_queue_processes parameter for administrative jobs.

Parameter name: processes
Recommended value:
Comment: is a minimum value. Larger environments, where several StreamServer applications are used, may need a higher value.

Parameter name: pga_aggregate_target
Recommended value: <Machine RAM> - <Non Oracle RAM> - sga_target
Comment: The relationship between pga_aggregate_target and sga_target depends on the number of users. Start with: sga_target = pga_aggregate_target

Parameter name: sga_target
Recommended value: <Machine RAM> - <Non Oracle RAM> - pga_aggregate_target
Comment: The relationship between pga_aggregate_target and sga_target depends on the number of users. Start with: sga_target = pga_aggregate_target
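As an illustration of the parameter table above, the sketch below first reads the current values and then applies an equal split between SGA and PGA. The sizes are hypothetical (a server with 8 GB RAM of which 2 GB is reserved for non-Oracle use) and the statements assume that the instance was started with a server parameter file.

```sql
-- Current values of the parameters discussed above
SELECT name, value, isdefault
FROM   v$parameter
WHERE  name IN ('job_queue_processes', 'processes',
                'pga_aggregate_target', 'sga_target')
ORDER  BY name;

-- Hypothetical sizing: 8 GB machine RAM - 2 GB non-Oracle RAM = 6 GB,
-- split equally as a starting point (sga_target = pga_aggregate_target)
ALTER SYSTEM SET sga_target = 3G SCOPE = SPFILE;
ALTER SYSTEM SET pga_aggregate_target = 3G SCOPE = SPFILE;
```

With SCOPE = SPFILE the new values take effect at the next restart. Note that sga_target cannot be set higher than sga_max_size.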

Disabling changed locking behavior

This section applies for Oracle and later, and previous Oracle 11 versions with Oracle patch applied (changed locking behavior).

Problem
If a service-enabled Message is paused by an exception rule, the Message is stored in a Message storage in the runtime repository. The Message storage tables are created by the StreamServer application the first time the application is started. If there are other ongoing transactions against certain tables in the runtime repository, the StreamServer application may fail when enabling the required foreign key constraints. The following error message is displayed:

  ORA-00054: resource busy and acquire with NOWAIT specified

Solution
A supported work-around from Oracle is to disable the patch by running the statement below. Disabling the patch means that no exclusive access to the parent table will be required when enabling foreign key constraints. Since the parameter is dynamic, you do not have to restart the Oracle database after disabling the patch.

  ALTER SYSTEM SET "_fix_control"=' :off' scope=both;

You can run the following query to check that the patch was disabled:

  SELECT bugno,value,is_default FROM V$SYSTEM_FIX_CONTROL WHERE bugno = ;

The following should be returned:

  BUGNO
  VALUE 0
  IS_DEFAULT 0

Placing data files and redo-log files

It is recommended to use S.A.M.E. or ASM (see Hardware and software requirements). However, if neither S.A.M.E. nor ASM is used, the following is recommended (Oracle best practice):
- At least two control files on different disks.
- At least two redo-log members per group, placed on different dedicated disks (no data files on these disks).
- Spread the data files on the remaining disks.

For more information, see the Oracle user documentation.

Configuring of Oracle Net

It is recommended to put the following parameter in sqlnet.ora on the database server:

  SQLNET.EXPIRE_TIME = 10

This ensures that Oracle detects and removes sessions that are left open but no longer used, for example because of network problems or because a client machine has gone down. The value is the interval, in minutes, at which Oracle sends a probe to verify that client connections are still active. For more information, see the Oracle user documentation.
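To compare an existing installation with the file placement recommendations above, the current locations can be listed from the standard dynamic and dictionary views. This is a sketch for a DBA session and only reads information.

```sql
-- Control file copies
SELECT name FROM v$controlfile;

-- Redo-log members per group
SELECT group#, member
FROM   v$logfile
ORDER  BY group#, member;

-- Data files and temp files per tablespace
SELECT tablespace_name, file_name FROM dba_data_files
UNION ALL
SELECT tablespace_name, file_name FROM dba_temp_files
ORDER  BY 1, 2;
```

The file paths show which disks or mount points are shared between control files, redo-log members and data files.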

Creating StreamServe repositories

You use StreamServe Control Center to configure and create the StreamServe repositories. You can either create the repositories directly in Control Center, or you can generate the database scripts in Control Center and then run the scripts using an external tool. The latter is useful, for example, if the company security policy prevents Control Center from connecting to the database, or if you want to have full traceability of the repository creation.

If Control Center is not available at all, you can carry out the corresponding procedures using the command line utilities. For more information, see the StreamServe Command line utilities documentation.

Modifying repository users
In the Control Center configurations, default repository users are suggested (see Overview of StreamServe repositories). It is recommended to change these users before creating the repositories.

If Control Center is not available, you can modify the users directly in the template files. These files are located in:

  Windows: <StreamServe installation>\applications\management\<version>\etc\databasescripts\<version>\oracle\security
  UNIX: <StreamServe installation>/applications/managementgateway/etc/databasescripts/<version>/oracle/security

Prerequisites
An Oracle database must be installed before you create the StreamServe repositories, see Installing an Oracle database.

To create the StreamServe repositories
See the Control Center documentation for detailed information on how to:
- Configure the StreamServe repositories.
- Create the StreamServe repositories directly in Control Center.
- Generate the database scripts in Control Center and then execute the scripts.

Note: Since the scripts contain passwords, it is recommended to delete the scripts after you have created a repository.

Checking StreamServe repositories

After creating a repository, it is recommended to check the repository.

Check the log files for errors
When creating a repository in Control Center, you can check a short version of the log file in Control Center, or you can open the full log file in a text editor from within Control Center.

When executing the generated scripts using an external tool, you can use the logging features of this tool. For example, when using Oracle SQL*Plus, you can use the SPOOL command. For more information, see the Oracle user documentation.

Perform a sanity test
You should always perform a sanity check to make sure the repository was created according to the configurations.

Check the database for INVALID schema objects
Run the query below in Oracle SQL*Plus as a DBA user. No rows should be returned.

  SELECT owner, object_name
  FROM dba_objects
  WHERE status = 'INVALID';

If you find INVALID objects, you can recompile these objects by running the following command:

  EXEC DBMS_UTILITY.compile_schema(schema => '<Schema name>');
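The logging and the invalid-object check described above can be combined in one SQL*Plus session. The sketch below assumes hypothetical file names (create_repository.sql and create_repository.log) and uses a SQL*Plus substitution variable, schema_name, which is prompted for at run time.

```sql
-- SQL*Plus: capture all output of the generated script in a log file.
-- Both file names are placeholders for your own files.
SPOOL create_repository.log
@create_repository.sql
SPOOL OFF

-- List anything that is not VALID in one schema, with the object type
SELECT object_type, object_name, status
FROM   dba_objects
WHERE  owner = UPPER('&schema_name')
AND    status <> 'VALID'
ORDER  BY object_type, object_name;
```

Restricting the check to one owner avoids noise from invalid objects in unrelated schemas in the same database.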

Adjusting StreamServe repositories

An enterprise repository that you create in Control Center can be used for development and testing purposes and may also be sufficient for a production environment. If specific security or performance requirements apply, you may have to adjust the enterprise repository to fit the actual conditions.

A runtime repository and a StreamServe archive that you create in Control Center are sufficient for development and testing purposes. However, before using these repositories in a production environment, you most likely have to adjust the repositories to fit the actual conditions. For example, you may have to edit tablespaces, index columns, and partition tables and indexes.

It is recommended to create the repositories using Control Center and then adjust the repositories using the appropriate Oracle tool, for example Oracle Enterprise Manager or Oracle SQL*Plus.

In this section
- Tables with highest growth rate and number of rows
- Editing the StreamServe tablespaces
- Indexing
- Partitioning

Tables with highest growth rate and number of rows

Runtime repository
The runtime repository tables with highest growth rate and highest number of rows are:
- BinaryObject
- BlobInfo
- Part
- VariableMetadata
- FixedMetadata
- CXMESSAGE_<8-DigitNumber>
  Note: Only if Messages are paused in the Message storage.
- CXPOST_<8-DigitNumber>
  Note: Only if documents are stored in the Post Process storage.

StreamServe archive
The StreamServe archive table with highest growth rate and highest number of rows is:
- Meta_<5-DigitNumber>
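To follow the growth of the tables listed above in a given installation, segment sizes and row counts can be read from the Oracle data dictionary. This is a sketch, assuming a DBA session. The substitution variable repository_owner is a placeholder for the schema owner of the repository.

```sql
-- Largest segments owned by the repository owner, in MB
SELECT segment_name,
       segment_type,
       ROUND(SUM(bytes) / 1024 / 1024) AS size_mb
FROM   dba_segments
WHERE  owner = UPPER('&repository_owner')
GROUP  BY segment_name, segment_type
ORDER  BY size_mb DESC;

-- Row counts as of the last statistics gathering
SELECT table_name, num_rows, last_analyzed
FROM   dba_tables
WHERE  owner = UPPER('&repository_owner')
ORDER  BY num_rows DESC NULLS LAST;
```

LOB data is stored in separate segments of type LOBSEGMENT with system-generated names, so the space used by BLOB columns does not appear under the table name in the first query. The num_rows column is only as current as the optimizer statistics.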

Editing the StreamServe tablespaces

When the runtime repository is created, all segments (tables, indexes, etc) are by default created in a tablespace called USERS. The tablespace is a logical storage unit within the Oracle database. In Oracle Enterprise Manager or Oracle SQL*Plus, you can edit the default tablespace to fit the actual conditions.

Using different tablespaces enables you to control the disk layout. For example, you can place a heavily used index on a fast disk and a rarely accessed database table on a less expensive, but slower disk.

Tablespaces for dynamically created tables

Message / Post-processing storage
At runtime, tables and indexes are created in the first found tablespace called <Name>_DATA (for tables) and <Name>_INDEX (for indexes) in which the repository owner has a quota. If no such tablespaces are found, the segments are created in the default tablespace for the repository owner.

StreamServe archive
At runtime, tables, indexes, and blobs are created in the first found tablespace called <Name>_DATA (for tables), <Name>_INDEX (for indexes), and <Name>_LOB (for blobs) in which the repository owner has a quota. If no such tablespaces are found, the segments are created in the USERS tablespace.

To edit the StreamServe tablespaces
1. Create the required tablespaces (locally managed, automatic segment space management). For example:
   - <Name>_DATA for data.
   - <Name>_INDEX for index.
   - <Name>_LOB for blobs.
2. Give quotas to the StreamServe repository owner in the created tablespaces.
3. Open the build_move_segments.sql script, located in:

     <Base directory>\root\config\database\5.5.0\strsdata\oracle

   Where <Base directory> is the path specified for StreamServe Projects during the StreamServe Framework and Control Center installation. For example:

     C:\ManagementGateway\1.0

4. Run the build_move_segments.sql script as the schema owner. The script spools the DDL (Data Definition Language) move commands to a text file.
5. Check/verify the created text file move_streamserve_segments_temp.sql and run the script.

   For example, running build_move_segments.sql:

     Enter current index tablespace: USERS
     Enter current LOB tablespace: USERS
     Enter new data tablespace: <Name>_DATA
     Enter new index tablespace: <Name>_INDEX
     Enter new LOB tablespace: <Name>_LOB

6. Verify that there are no errors in the log file move_streamserve_segments_TEMP.log.
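Steps 1 and 2 of the procedure above, and a check after the move, could look like the following sketch. The tablespace prefix STRS, the data file paths and sizes, and the substitution variable repository_owner are all assumptions made for illustration. Adjust them to your own disk layout.

```sql
-- Step 1: locally managed tablespaces with automatic segment space management
CREATE TABLESPACE strs_data
  DATAFILE '/u01/oradata/orcl/strs_data01.dbf' SIZE 1G
  AUTOEXTEND ON NEXT 256M
  EXTENT MANAGEMENT LOCAL
  SEGMENT SPACE MANAGEMENT AUTO;

CREATE TABLESPACE strs_index
  DATAFILE '/u02/oradata/orcl/strs_index01.dbf' SIZE 1G
  AUTOEXTEND ON NEXT 256M
  EXTENT MANAGEMENT LOCAL
  SEGMENT SPACE MANAGEMENT AUTO;

CREATE TABLESPACE strs_lob
  DATAFILE '/u03/oradata/orcl/strs_lob01.dbf' SIZE 1G
  AUTOEXTEND ON NEXT 256M
  EXTENT MANAGEMENT LOCAL
  SEGMENT SPACE MANAGEMENT AUTO;

-- Step 2: quotas for the repository owner
ALTER USER &repository_owner
  QUOTA UNLIMITED ON strs_data
  QUOTA UNLIMITED ON strs_index
  QUOTA UNLIMITED ON strs_lob;

-- After the move script has run: where did the segments end up?
SELECT tablespace_name, segment_type, COUNT(*) AS segments
FROM   dba_segments
WHERE  owner = UPPER('&repository_owner')
GROUP  BY tablespace_name, segment_type
ORDER  BY tablespace_name, segment_type;

-- Moving a table invalidates its indexes until they are rebuilt;
-- no rows are expected here once the move is complete
SELECT index_name, status
FROM   dba_indexes
WHERE  owner = UPPER('&repository_owner')
AND    status = 'UNUSABLE';
```

Placing the three data files on different mount points is what gives control over the disk layout. On a S.A.M.E. or ASM setup the separation is mainly an administrative convenience.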

Indexing

For most tables, performance will be improved by indexing one or several of the columns. You should consider indexing columns that are used to filter out a relatively small number of rows from the tables.

Indexing columns improves the speed of data retrieval operations, but at the cost of increased storage space and slower writes. If a table is subjected to lots of inserts, increasing the number of indexes may have a negative effect on the insert performance. You must take this into consideration when deciding whether to index or not.

You index columns using the appropriate Oracle tool, for example Oracle SQL*Plus or Oracle SQL Developer.

Prerequisites
To create an index in your own schema, one of the following must apply:
- The table or cluster to be indexed must be in your own schema.
- You must have the INDEX object privilege on the table to be indexed.

To create an index in another schema, you must have the CREATE ANY INDEX system privilege. Also, the owner of the schema to contain the index must have space quota on the tablespaces to contain the index or index partitions.

In this section
- Columns suitable for indexing
- Indexing columns in the runtime repository
- Indexing columns in the StreamServe archive

Columns suitable for indexing

The columns most suitable for indexing are usually the ones with user defined metadata, configured in StreamServe Design Center. Consult the end-users for information about which document types and which metadata are most frequently used and index the corresponding columns.

Indexing columns in the runtime repository

If the runtime repository contains Message storages or Post-processing storages, you should consider indexing these storages.

Message storage
When a Message is paused by an exception rule, the Message is stored in the Message storage. The Messages can then be invoked via service calls, for example web service calls from Ad Hoc Correspondence or Correspondence Reviewer.

If a large number of rows are stored in a Message storage, the service call performance may be improved by indexing one or several of the columns. The columns most suitable for indexing are the ones for user defined metadata (that is, the ones configured with the Message context in Design Center).

Example 1 Indexing a column in the Message storage
In this example, an index is added to the column COL_1 in a Message storage called CXMESSAGE_ The column is indexed using the following command:

  CREATE INDEX ixlcol_1 ON CXMESSAGE_ (COL_1);

Post-processing storage
When Document Broker Plus is used, the documents are stored in a Post-processing storage. Metadata is used when searching for, retrieving and post-processing the documents.

If a large number of rows are stored in the storage, the search performance may be improved by indexing one or several of the columns. The columns most suitable for indexing are the ones for user defined metadata (that is, the ones configured with the Post-processing context in Design Center).

Example 2 Indexing a column in the Post-processing storage
In this example, an index is added to the column COL_1 in the Post-processing storage called cxpost_ The column is indexed using the following command:

  CREATE INDEX ixlcol_1 ON stc.cxpost_ (col_1);
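Because the Message and Post-processing storage tables are created at runtime with generated names, it can help to look up the actual table names and any existing indexes before adding new ones. The sketch below assumes the CXMESSAGE_ and CXPOST_ prefixes given in this document. The substitution variables repository_owner and storage_table are placeholders.

```sql
-- Dynamically created storage tables owned by the repository owner.
-- The underscore is escaped because it is a wildcard in LIKE patterns.
SELECT table_name
FROM   all_tables
WHERE  owner = UPPER('&repository_owner')
AND   (table_name LIKE 'CXMESSAGE\_%' ESCAPE '\'
   OR  table_name LIKE 'CXPOST\_%' ESCAPE '\')
ORDER  BY table_name;

-- Columns that are already indexed on one of those tables
SELECT index_name, column_name, column_position
FROM   all_ind_columns
WHERE  table_owner = UPPER('&repository_owner')
AND    table_name  = UPPER('&storage_table')
ORDER  BY index_name, column_position;
```

Checking the second query first avoids creating a duplicate index on a column that is already covered.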

Indexing columns in the StreamServe archive

When an end-user searches for documents in StreamStudio Collector, metadata is used as search criteria. If a large number of rows are stored in the metadata tables, the search performance may be improved by indexing the columns for this metadata. The columns most suitable for indexing are the ones for user defined metadata (configured with the Archive context in Design Center).

Example 3 Indexing metadata
In this example, an index is added to the metadata column Col_1 for the document type table META_

  CREATE INDEX ixlcol_1 ON Meta_00001(Col_1);
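Whether a new index is actually used by a search can be checked with EXPLAIN PLAN. The sketch below reuses the table and column names from Example 3 and assumes that Col_1 holds character data. The search value is made up, and the statements are run as the owner of the table.

```sql
-- Refresh optimizer statistics for the table and its indexes
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'META_00001', cascade => TRUE);

-- Ask the optimizer how it would run a typical search
EXPLAIN PLAN FOR
  SELECT *
  FROM   Meta_00001
  WHERE  Col_1 = 'some value';

-- Show the plan; an INDEX RANGE SCAN on the new index indicates it is used
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan still shows a full table scan, the column may not be selective enough for the optimizer to prefer the index.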

Partitioning

For large tables (for example, larger than 2 GB), performance may be improved by partitioning tables and indexes. Tables and indexes are then split into smaller components, where each component can be managed and manipulated individually. For example, when deleting documents from the StreamServe archive you can drop one or several partitions instead of deleting millions of documents.

Partitioning is a separately licensed option, on top of Oracle Database Enterprise Edition.

When setting up partitioning, it is recommended to consult an Oracle expert. For detailed information, see the Oracle user documentation.

If the tables contain data, conversion to partitioning is an extensive operation that requires recreating the tables and reloading the data into them.

Tables suitable for partitioning

Runtime repository

• BinaryObject
  The table has no field that is suited for range partitioning, but you can use hash partitioning on the field PartID in order to spread I/O and for parallel SQL operations.
• BlobInfo
  The table can be range partitioned on the field CreationDate, although it requires global indexes for the primary key and foreign key indexes. You can also hash partition it on, for example, the field PartID.
• Part
  The table can be range partitioned on the field CreationDate, although it requires global indexes for the primary key and foreign key indexes. You can also hash partition it on, for example, the field PartID.
• VariableMetadata
• FixedMetadata
• CXMESSAGE_<8-DigitNumber>
• CXPOST_<8-DigitNumber>

StreamServe archive

• Meta_<5-DigitNumber>

The field BlobInfo.CreationDateTime is suitable for range partitioning.
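The difference between range and hash partitioning, and the benefit of dropping a partition instead of deleting rows, can be illustrated with a small sketch. The tables demo_documents and demo_parts, their columns and the partition names are made up for illustration and are not part of any StreamServe repository. The statements require the Oracle Partitioning option.

```sql
-- Sketch only: made-up tables, not part of any StreamServe repository.
CREATE TABLE demo_documents (
  doc_id        NUMBER        NOT NULL,
  creation_date DATE          NOT NULL,
  payload       VARCHAR2(200)
)
PARTITION BY RANGE (creation_date) (
  PARTITION p_2012_q1 VALUES LESS THAN (DATE '2012-04-01'),
  PARTITION p_2012_q2 VALUES LESS THAN (DATE '2012-07-01'),
  PARTITION p_max     VALUES LESS THAN (MAXVALUE)
);

-- Remove a whole quarter in one operation instead of deleting row by row.
-- UPDATE GLOBAL INDEXES keeps any global indexes usable after the drop.
ALTER TABLE demo_documents DROP PARTITION p_2012_q1 UPDATE GLOBAL INDEXES;

-- Hash partitioning spreads the rows over a fixed number of partitions.
CREATE TABLE demo_parts (
  part_id NUMBER        NOT NULL,
  payload VARCHAR2(200)
)
PARTITION BY HASH (part_id) PARTITIONS 8;
```

Range partitioning on a date suits data that is removed by age, while hash partitioning mainly helps to spread I/O when no natural range key exists.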

Uninstalling StreamServe repositories

When you uninstall StreamServe, you can choose to remove the StreamServe components. This uninstalls the files and services for the StreamServe repositories, but does not delete the repositories or the repository users.

Under certain circumstances, you may want to drop a repository, for example if you upgrade to a later StreamServe release and want to create a new enterprise repository instead of upgrading the existing repository.

When you drop a repository, all data within the repository is lost.

Prerequisites

Before dropping a repository, the following must be fulfilled:

• The StreamServe components must be removed. If not, the repository services are not deleted, resulting in a corrupt StreamServe installation. For information on how to uninstall and remove StreamServe components, see the Installation Guide.
• There must be no active sessions running against the repository.

To read active sessions

To read active sessions, you can run the following command in Oracle SQL*Plus as a DBA user, for example SYSDBA.

SELECT username, count(username)
FROM v$session
WHERE username like 'strs%'
GROUP BY username;

To drop a repository

To drop a repository, you can run the following command as a user with privileges at least corresponding to a DBA user.

DROP USER <Schema name> CASCADE;
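When the count of active sessions is not zero, it usually helps to see the individual sessions before deciding how to end them. The following is a minimal sketch, assuming that the repository schema names start with strs as in the command above. Oracle stores user names in upper case unless they were created with quoted identifiers, so the comparison is made case-insensitive here.

```sql
-- Sketch only: list each active session for schemas whose name starts
-- with STRS, regardless of the case used when the user was created.
SELECT sid,
       serial#,
       username,
       program,
       status
FROM   v$session
WHERE  UPPER(username) LIKE 'STRS%'
ORDER  BY username, sid;
```

The PROGRAM column normally indicates which application holds the session, which makes it easier to find the StreamServe application or tool that must be stopped before the repository is dropped.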

Maintaining StreamServe repositories

You maintain the StreamServe repositories using standard Oracle maintenance procedures. For recommended maintenance activities, including error handling and performance improvement strategies, see the Oracle user documentation.

StreamServe specific maintenance tasks

If you experience performance bottlenecks related to the database, you can tune the StreamServe specific maintenance tasks below to optimize your environment.

Note: If you do not experience any performance bottlenecks, you should keep the default setup and the default scheduling.

• Job deletion
  You can optimize the way in which expired top jobs and job status information are deleted from the runtime repository.
• Job status update
  To be deleted from the runtime repository, the status of a top job must be set to completed. You can optimize the way in which the status of top jobs is updated.
• Maintenance package
  StreamServe provides scripts for gathering statistics and rebuilding indexes. These scripts are described in this chapter.

Running the StreamServe specific maintenance tasks puts load on the database and may affect the database performance. This chapter contains some recommendations regarding the setup and scheduling of the maintenance tasks. These recommendations are not absolute truths. Treat them as starting points for finding the optimal settings for your specific job setup and environment.

Tuning the maintenance tasks is a continuous assignment. You must monitor the jobs in the runtime repository to ensure that the tasks are correctly scheduled and that the database does not grow over time.

You should always schedule the maintenance tasks in a way that ensures minimum impact on any other database activities.

In this section

• Top jobs, input jobs and output jobs
• Deleting expired jobs from the runtime repository
• Updating top job statuses in the runtime repository
• Gathering statistics
• Rebuilding indexes
• Running maintenance procedures as database jobs
• Monitoring jobs in the runtime repository

Top jobs, input jobs and output jobs

This section describes the concept of top jobs. You must be aware of this concept when scheduling the tasks for job deletion and the job status update.

Top jobs, input jobs and output jobs

StreamServe jobs are divided into top jobs, input jobs and output jobs. A top job is created when a StreamServer application receives input, for example from an ERP system. Each top job is divided into one or more input jobs processed by the StreamServer application. When the application processes an input job, it produces one or more output jobs.

This means a top job can be divided into several input jobs, and each input job can be divided into several output jobs. All jobs that belong to the same top job are identified by the same Tracker ID. A top job is completed when all included input and output jobs are completed.

Figure 6 Top job, input jobs and output jobs

Deleting expired jobs from the runtime repository

Expired top jobs and job status information must be deleted from the runtime repository. The job deletion must also include expired Messages/documents in Message/Post-processing storages in the runtime repository.

Three ways of deleting jobs

In this documentation, three ways of deleting jobs are described:

• Schedule the StreamServer applications to delete jobs (default).
• Schedule one or several Task Scheduler applications to delete jobs.
• Schedule the database to delete jobs.

Low volume installation with a single StreamServer application

By default, each StreamServer application is set up to perform job data deletion tasks every hour. Successfully completed job data is kept in the repository for 5 days. Failed job data is kept for a month. The job deletion is regulated in the repositorymanager.xml configuration file for each StreamServer application.

This job deletion is suitable for less complex, low volume installations with a single StreamServer application. To optimize the delete performance, you may have to update the schedule.

High volume installations with several StreamServer applications

For medium to high volume installations, where several StreamServer applications try to follow the instructions in repositorymanager.xml at the same time, it is recommended to let one or several StreamServe Task Scheduler applications delete the jobs instead. Using Task Scheduler applications may improve performance by centralizing the job deletion and enabling more complex and controlled schedules.

If you prefer to handle the job deletion outside the StreamServe software, you can set up a job scheduling task for deleting jobs directly in the database instead of using Task Scheduler applications.

In this section

• Job deletion process
• Prerequisites and recommendations
• Scheduling StreamServers to delete jobs
• Scheduling Task Schedulers to delete jobs
• Scheduling the database to delete jobs

Job deletion process

This section describes the order in which expired top jobs and expired Messages/documents in Message/Post-processing storages are deleted.

If you schedule StreamServer applications to delete jobs, the stored procedures are automatically executed in the correct order. If you schedule Task Scheduler applications or the database to delete jobs, you should make sure that the tasks are executed as described below.

Step 1 Expire jobs

In StreamServe Design Center, expiry times for successfully processed and failed top jobs are configured. The jobs remain in the queues for the expiry time and are then ready to be handled by the job deletion process.

Note: You can also manually expire jobs using StreamStudio Reporter or StreamServe Database Administration Tool.

Step 2 Delete expired Messages and documents

The pc_deleteexpireddocabstraction stored procedure deletes any expired Messages/documents from Message/Post-processing storages.

Step 3 Mark jobs for deletion

The pq_deletemarkexpiredtopparts stored procedure searches for expired top jobs that are ready to be deleted and marks these jobs for deletion.

Step 4 Delete marked jobs

The pq_pickdeleteevent stored procedure deletes any expired top jobs marked for deletion. Only top jobs without any related Messages/documents in Message/Post-processing storages can be deleted.

Note: An archiving job cannot be deleted unless all related documents are first transferred to the StreamServe archive.

Prerequisites and recommendations

For optimal performance when deleting jobs and job status information, you must consider both the Design Center configurations and the scheduling of the job deletion.

In this section

• Design Center configurations
• Job deletion schedule

Design Center configurations

• It is strongly recommended to configure the input and output queues to store both successful and failed jobs (Store information and job should be selected for successful and failed jobs in the Manage Queues dialog box).
  Note: Both information and jobs are always initially stored in the runtime repository. If the options to store nothing or store information only are used, expired input and output jobs are continuously deleted from the queues. This results in an increased number of delete transactions and a decreased delete performance.
• To delete successful jobs, the deletion process must be allowed to delete successful jobs (Delete successful jobs must be enabled in the Configure Platform dialog box).
  It is recommended to configure a short expiry time for the successfully processed jobs (the Allow deletion after setting in the Configure Platform dialog box). As a rule of thumb, you should keep the number of top jobs marked for deletion to a minimum. This has a great impact on the delete performance, especially if each top job generates one single output job. However, you must also consider how long the customer wants to access successful jobs in the queues.
• To delete failed jobs, the job deletion process must also be allowed to delete failed jobs (Delete failed jobs must be enabled in the Configure Platform dialog box).
  An expiry time for the failed top jobs must be configured (the Allow deletion after setting in the Configure Platform dialog box). You must consider how long the customer wants to access the failed jobs in the queues.
• If possible, avoid using notifications (Use notifications should be cleared in the Configure Platform dialog box).

For more information about the options, see the Design Center documentation.

Job deletion schedule

It is recommended to schedule the job deletion in the following way:

• Primarily, you should run the job deletion during a time period when the StreamServer applications are idle or the job throughput is low. For example, after scheduled batch jobs or when the average CPU usage for the database falls below a specified value and remains below this level for a specified time period.
  Note: You must make sure that the available time period is longer than the time interval required to complete the job deletion after a peak load.
• If the available time periods are too short or if the workload is continuous, you should start the job deletions during an available time period and then schedule the remaining deletions in the following way:
  In general, schedule a continuous job deletion with a high frequency.
  Exception: If most of your top jobs generate many output jobs, you may get better delete performance by running the job deletion less frequently or (if possible) after each top job is successfully completed.

Scheduling StreamServers to delete jobs

By default, the job deletion is scheduled in the repositorymanager.xml file for each StreamServer application.

• Every 20 minutes, each StreamServer application searches for and marks any expired top jobs that are ready to be deleted.
• Every 60 minutes, each StreamServer application triggers the deletion of marked jobs, expired Messages and expired documents.

You can edit the default schedules to fit the actual conditions.

Location of repositorymanager.xml

You can either edit the configuration in:

• The template file (used for all new StreamServer applications), located in:
  Windows: <StreamServe installation>\applications\management\<version>\etc\config\<version>\strscs\
  UNIX: <StreamServe installation>/applications/managementgateway/etc/config/<version>/strscs/
  The changes that you make to a template file only apply to new applications, created after the changes are made. They do not apply to already created applications.
• The configuration file for a specific StreamServer application, located in:
  Windows: <Base directory>\root\applications\<application>\<layer>
  UNIX: <Project location>/applications/<application>/<layer>

Prerequisites and recommendations

See Prerequisites and recommendations.

To edit the schedule for job deletion

1 Open the repositorymanager.xml file.
2 Edit the following lines:

<deleteevent schedule="t II * * MH * * 60" />
<deletemarkevents>
  <toppart use="default" schedule="t II * * MH * * 20" />
</deletemarkevents>

For syntax, see Appendix A - Time scheduling syntax.

Scheduling Task Schedulers to delete jobs

You can add a StreamServe Task Scheduler application and configure separate tasks for deleting expired top jobs, expired Messages and expired documents. To ensure job deletion even if the Task Scheduler goes down, you can add several Task Scheduler applications.

Prerequisites and recommendations

See Prerequisites and recommendations. For information about the order in which the stored procedures should be executed, see Job deletion process.

Post requisites

It is recommended to comment out the corresponding lines for job deletion in the repositorymanager.xml files for the StreamServer applications. See Scheduling StreamServers to delete jobs.

To schedule a Task Scheduler to delete jobs

1 In StreamServe Control Center, right-click the application domain and select New Application. The New Application dialog box opens.
2 Configure the application properties for the new Task Scheduler.
  Note: You cannot use the name Task Scheduler if you run an application on a Windows host, since this name is used by a Windows service.
3 Click OK. The Configuration dialog box opens.
4 Select the (Item list) field and click the button to the right of the field. The Service Configuration dialog box opens.
5 Add and configure the required tasks:

  Delete expired jobs
  Select to mark expired top jobs for deletion and delete these expired jobs from the runtime repository at the scheduled interval.
  Threads: The maximum number of threads to be used when deleting expired jobs marked for deletion. Several threads enable the application to delete several jobs in parallel. Note that only the first thread searches for and marks jobs for deletion.
  Note: Each thread consumes system resources.

  Delete expired Messages
  Select to delete expired Messages from Message storages in the runtime repository at the scheduled interval.

  Delete expired documents
  Select to delete expired documents from Post-processing storages in the runtime repository at the scheduled interval.

For more information, see the Control Center documentation.

Scheduling the database to delete jobs

You can schedule the database to delete expired top jobs, expired Messages and expired documents. The task should be scheduled in a way that ensures minimum impact on any other database activities. You create the job scheduling task in the appropriate Oracle tool, for example in Oracle Enterprise Manager.

Prerequisites and recommendations

See Prerequisites and recommendations. For information about the order in which the stored procedures should be executed, see Job deletion process.

Post requisites

It is recommended to comment out the corresponding lines in the repositorymanager.xml files for the StreamServer applications. See Scheduling StreamServers to delete jobs.

To schedule the database to delete jobs

1 Create a job scheduling task, for example in Oracle Enterprise Manager.
2 In the task, use the script run_pickdeleteevent.sql, located in:
  <Base directory>\root\config\database\<version>\strsdata\oracle\maintenance\
  Where <Base directory> is the path specified for StreamServe Projects during the StreamServe Framework and Control Center installation. For example: C:\ManagementGateway\1.0
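As an alternative to Oracle Enterprise Manager, the job scheduling task can be created directly with the standard Oracle package DBMS_SCHEDULER. The following is a minimal sketch under stated assumptions: run_expired_job_cleanup is a hypothetical wrapper procedure that you would create yourself from the statements in run_pickdeleteevent.sql, and the job name is made up. Neither is shipped with StreamServe.

```sql
-- Sketch only. run_expired_job_cleanup is a HYPOTHETICAL wrapper procedure
-- that you would create yourself from the statements in
-- run_pickdeleteevent.sql. It is not shipped with StreamServe.
BEGIN
  DBMS_SCHEDULER.CREATE_JOB (
    job_name        => 'STRS_DELETE_EXPIRED_JOBS',          -- made-up job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN run_expired_job_cleanup; END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=HOURLY; INTERVAL=1',           -- same as the default 60 minutes
    enabled         => TRUE,
    comments        => 'Delete expired top jobs, Messages and documents');
END;
/
```

The hourly interval mirrors the default StreamServer schedule. A different repeat_interval, for example one restricted to night hours, may fit better when the deletion has to stay clear of peak load.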

Updating top job statuses in the runtime repository

A successful top job cannot be deleted from the runtime repository unless the status of the top job is updated to completed. Neither can any documents be transferred to the StreamServe archive unless the top job status is completed. A failed top job cannot be deleted unless the status is updated to aborted.

Three ways of updating job status

In this documentation, three ways of updating top job statuses are described:

• Schedule the StreamServer applications to update statuses (default).
• Schedule one or several Task Scheduler applications to update statuses.
• Schedule the database to update statuses.

Low volume installation with a single StreamServer application

By default, all StreamServer applications update top job statuses every minute. The ability to update the statuses and the time interval are regulated in the repositorymanager.xml configuration file for each StreamServer application.

This way of updating top job statuses is suitable for less complex, low volume installations with a single StreamServer application. To optimize the performance, you may have to update the schedule.

High volume installations with several StreamServer applications

For medium to high volume installations, where several StreamServer applications try to follow the instructions in repositorymanager.xml at the same time, it is recommended to let one or several StreamServe Task Scheduler applications update the job statuses instead. Using Task Scheduler applications may improve performance by centralizing the status updates and enabling more complex and controlled schedules.

If you prefer to handle the status updates outside the StreamServe software, you can set up a job scheduling task for updating job statuses directly in the database instead of using Task Scheduler applications.

In this section

• Job status update process
• Recommendations
• Scheduling StreamServers to update job statuses
• Scheduling Task Schedulers to update job statuses
• Scheduling the database to update job statuses

Job status update process

Step 1 Update status

When updating a status, the pq_pickreadystatusevent stored procedure is used to check and update the status of the top jobs. A top job is completed when all included sub jobs are completed. The top job is aborted if any of the included sub jobs has failed the maximum number of retries.

Step 2 Report status

When the top job status is updated, one application in the application domain can report the status, for example in the application log file. After being reported, the status is consumed and is no longer available to the other applications.

This means that one application can check and update the status, and another application can report and consume the status. Of these two operations, the status update is the most resource intensive.

Note: If StreamServe Status Messenger is used to generate reports in the application domain, you must make sure that the application that runs the Design Center Project for the Status Messenger is the only one that reports the status.

Recommendations

It is recommended to schedule the job status update in the following way:

• The job status update task must be scheduled in relation to the job deletion tasks and any archiving tasks. The job status update task should run more frequently than these tasks. See Deleting expired jobs from the runtime repository.
• In general, the following applies:
  If most of the top jobs generate one or a few output jobs, it is recommended to schedule a continuous job status update with a high frequency.
  If most of the top jobs generate many output jobs, it is recommended to schedule a job status update with a lower frequency.
• Since updating statuses is more resource intensive than reporting statuses, you should let only one application (or, if you use several applications for redundancy, as few as possible) perform the update operation.

Scheduling StreamServers to update job statuses

By default, every StreamServer application updates top job statuses every minute. The ability to update the statuses and the time interval are regulated in the repositorymanager.xml configuration file for each StreamServer application. You can edit the default schedule to fit the actual conditions.

Location of repositorymanager.xml

You can either edit the configuration in:

• The template file (used for all new StreamServer applications), located in:
  Windows: <StreamServe installation>\applications\management\<version>\etc\config\<version>\strscs\
  UNIX: <StreamServe installation>/applications/managementgateway/etc/config/<version>/strscs/
  The changes that you make to a template file only apply to new applications, created after the changes are made. They do not apply to already created applications.
• The configuration file for a specific StreamServer application, located in:
  Windows: <Base directory>\root\applications\<application>\<layer>
  UNIX: <Project location>/applications/<application>/<layer>

Recommendations

See Recommendations.

To edit the schedule for updating job statuses

1 Open the repositorymanager.xml file.
2 Edit the following line:

<readystatusevent schedule="t II * * MH * * 1" update="true" report="true" />

Note: If Status Messenger is used to generate status reports, you must keep report="true" for the StreamServer application that runs the Status Messenger Project and set report="false" for all other applications.

For syntax, see Appendix A - Time scheduling syntax.

Scheduling Task Schedulers to update job statuses

You can add a StreamServe Task Scheduler application and configure a task to update the statuses of top jobs. To ensure job status updates even if the Task Scheduler application goes down, you can add several applications.

Recommendations

See Recommendations.

Post requisites

You must disable the configuration in the repositorymanager.xml files for the StreamServer applications. You can either comment out the corresponding line or you can keep the configuration, but change to update="false". See Scheduling StreamServers to update job statuses.

Note: As an alternative, you can override the setting in the configuration file by using the startup argument -statusevent 0. This argument must be applied on each Project for which you want to override the setting. For more information, see the Startup argument reference manual.

Note: If Status Messenger is used to generate status reports, you must keep the configuration in repositorymanager.xml (with update="false" and report="true") for the StreamServer application that runs the Status Messenger Project.

To schedule a Task Scheduler to update job statuses

1 In StreamServe Control Center, right-click the application domain and select New Application. The New Application dialog box opens.
2 Configure the application properties for the new Task Scheduler.
  Note: You cannot use the name Task Scheduler if you run an application on a Windows host, since this name is used by a Windows service.
3 Click OK. The Configuration dialog box opens.
4 Select the (Item list) field and click the button to the right of the field. The Service Configuration dialog box opens.
5 Add and configure the task for updating job status:

  Update job status
  Select to update and report the statuses of top jobs according to the scheduled interval.

  Update status
  Select to let the application update the statuses of top jobs.

  Report status
  Select to let the application report updated statuses.
  Note: Do not select Report status if Status Messenger is used to generate status reports.

For more information, see the Control Center documentation.

Scheduling the database to update job statuses

You can schedule the database to update the statuses of top jobs. The task should be scheduled in a way that ensures minimum impact on the other database activities. You create the job scheduling task in the appropriate Oracle tool, for example in Oracle Enterprise Manager.

Recommendations

See Recommendations.

Post requisites

You must disable the configuration in the repositorymanager.xml files for the StreamServer applications. You can either comment out the corresponding line or you can keep the configuration, but change to update="false". See Scheduling StreamServers to update job statuses.

Note: As an alternative, you can override the setting in the configuration file by using the startup argument -statusevent 0. This argument must be applied on each Project for which you want to override the setting. For more information, see the Startup argument reference manual.

Note: If Status Messenger is used to generate status reports, you must keep the configuration in repositorymanager.xml (with update="false" and report="true") for the StreamServer application that runs the Status Messenger Project.

To schedule the database to update job statuses

1 Create a job scheduling task, for example in Oracle Enterprise Manager.
2 In the task, use the script run_pickreadystatusevent.sql, located in:
  <Base directory>\root\config\database\<version>\strsdata\oracle\maintenance\
  Where <Base directory> is the path specified for StreamServe Projects during the StreamServe Framework and Control Center installation. For example: C:\ManagementGateway\1.0
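Once the status update runs as a database job, it is worth checking that the runs succeed and how long they take. The following sketch assumes that the task was created as an Oracle Scheduler job owned by the current user. The job name STRS_UPDATE_JOB_STATUS is made up for illustration.

```sql
-- Sketch only: show the most recent runs of a scheduler job.
-- 'STRS_UPDATE_JOB_STATUS' is a made-up job name.
SELECT job_name,
       status,
       actual_start_date,
       run_duration
FROM   user_scheduler_job_run_details
WHERE  job_name = 'STRS_UPDATE_JOB_STATUS'
ORDER  BY actual_start_date DESC;
```

A run duration that approaches the scheduling interval suggests that the update should run less often or at a time with lower load.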

Gathering statistics

In this section

• Gathering sample statistics
• Gathering system statistics

Gathering sample statistics

The Oracle 11g automatic statistics gathering is not always sufficient. It is therefore recommended to gather sample statistics for all StreamServe schemas every night (one sample is sufficient).

You must gather sample statistics when there are rows in the Part table and the QStatusReport table. If the tables are frequently emptied, you should gather the statistics when the tables have a lot of data and then lock the statistics in the whole schema. Do not forget to unlock the statistics if you wish to gather the statistics again.

The procedure described in this section collects statistics in the following way:

• For ALL objects in the schema (not only the ones that Oracle judges as having missing or stale statistics).
• Histograms are collected (FOR ALL COLUMNS SIZE AUTO).
• Oracle decides when to invalidate cursors (DBMS_STATS.AUTO_INVALIDATE).
• Sample size is set through procedure parameters. If these are not set, then DBMS_STATS.AUTO_SAMPLE_SIZE is used.
• Parallel degree is set through procedure parameters. If these are not set, then DBMS_STATS.AUTO_DEGREE is used.
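Locking and unlocking the statistics for a whole schema can be done with the standard DBMS_STATS procedures LOCK_SCHEMA_STATS and UNLOCK_SCHEMA_STATS. The following is a minimal sketch. The schema name STRS_RUNTIME is made up, and the block assumes that the gather_stats procedure in the maintenance package is run for that same schema.

```sql
-- Sketch only. 'STRS_RUNTIME' is a made-up schema name.
BEGIN
  -- Gather while Part and QStatusReport contain representative data.
  maintenance.gather_stats;
  -- Lock the statistics so that later gathering does not overwrite them.
  DBMS_STATS.LOCK_SCHEMA_STATS(ownname => 'STRS_RUNTIME');
END;
/

-- Before gathering again, release the lock first.
BEGIN
  DBMS_STATS.UNLOCK_SCHEMA_STATS(ownname => 'STRS_RUNTIME');
END;
/
```

While the statistics are locked, gathering attempts on the locked tables are rejected or skipped, which is why the unlock step has to come first.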

Example 4 gather_stats

-- NAME: gather_stats
-- PURPOSE: Gather optimizer statistics on StreamServe schema
-- PARAMETERS:
-- v_estimate_pct: Estimation percent in dbms_stats command
--   - Don't set to let Oracle decide the
--     sample size (AUTO_SAMPLE_SIZE)
-- v_parallel_degree: Run n stats gathering processes in
--   parallel
--   - Don't set to let Oracle decide the
--     parallelism
--     (AUTO_DEGREE, depends on table size,
--     CPU's and initialization parameters)
--   - Set to NULL to run using default table
--     parallelism
--   - Set to 1 to run using no parallelism
PROCEDURE gather_stats (
  v_estimate_pct IN NUMBER DEFAULT DBMS_STATS.AUTO_SAMPLE_SIZE,
  v_parallel_degree IN NUMBER DEFAULT DBMS_STATS.AUTO_DEGREE);

Example 5 Gathering statistics according to specification

In this example, the sample size is 5% and Oracle decides the degree of parallelism.

BEGIN
  maintenance.gather_stats (v_estimate_pct=>5);
END;
/

Example 6 Gathering statistics when Oracle decides sample size and parallelism

In this example, Oracle decides the sample size and the degree of parallelism.

BEGIN
  maintenance.gather_stats;
END;
/
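After a gathering run, the dictionary view USER_TAB_STATISTICS shows when each table was last analyzed and whether its statistics are locked. The following sketch assumes that it is run as the owner of the runtime repository schema and that the Part and QStatusReport tables exist there under those names.

```sql
-- Sketch only: confirm when statistics were last gathered for the two
-- tables that must contain rows during gathering, and whether the
-- statistics are locked.
SELECT table_name,
       num_rows,
       last_analyzed,
       stattype_locked
FROM   user_tab_statistics
WHERE  UPPER(table_name) IN ('PART', 'QSTATUSREPORT')
AND    object_type = 'TABLE'
ORDER  BY table_name;
```

A NUM_ROWS value of zero for these tables indicates that the statistics were gathered while the tables were empty, which is the situation the text above advises against.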

Gathering system statistics

It is recommended to gather system statistics on one occasion under normal system load using DBMS_STATS.GATHER_SYSTEM_STATS(). For more information, see Oracle Database Performance Tuning Guide 11g Release 2 (11.2) > Managing Optimizer Statistics > System statistics.

Example 7 Gathering system statistics

In the following example, you must change the XX* values before running:

-- Create table to hold stats
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE (
    ownname => 'XXMYOWNER',
    stattab => 'XXMYTABLENAME',
    tblspace => 'XXMYTABLESPACE');
END;
/

-- Export old system stats from dictionary to created stattable
BEGIN
  DBMS_STATS.EXPORT_SYSTEM_STATS (
    statown => 'XXMYOWNER',
    stattab => 'XXMYTABLENAME',
    statid => 'XXMYSTATID_OLD');
END;
/

-- Gather new system stats for one hour under normal system load
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS (
    gathering_mode => 'INTERVAL',
    interval => 60,
    statown => 'XXMYOWNER',
    stattab => 'XXMYTABLENAME',
    statid => 'XXMYSTATID_NEW');
END;
/

-- Import the gathered system stats from stattable to dictionary
BEGIN
  DBMS_STATS.IMPORT_SYSTEM_STATS (
    statown => 'XXMYOWNER',
    stattab => 'XXMYTABLENAME',
    statid => 'XXMYSTATID_NEW');
END;
/
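To verify that the import in Example 7 changed the system statistics, the values stored in the data dictionary can be read from SYS.AUX_STATS$. This is a minimal sketch and requires a user with access to that table, for example a DBA user.

```sql
-- Sketch only: show the system statistics currently stored in the
-- data dictionary (requires access to SYS.AUX_STATS$).
SELECT pname,
       pval1
FROM   sys.aux_stats$
WHERE  sname = 'SYSSTATS_MAIN'
ORDER  BY pname;
```

Running the query before and after the import makes it easy to compare the old and new values.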

Rebuilding indexes

Oracle does not recommend rebuilding indexes. Instead, you should use the COALESCE option of the ALTER INDEX statement. For more information, see the Oracle user documentation.

However, if your indexes do require rebuilding, for example if rows have been removed from a (previously) large table to which no new rows will be added, you can use the scripts provided in this section.

When rebuilding indexes, you must make sure there is sufficient free space in the destination tablespaces. See Ensuring sufficient free space.

Example 8 rebuild_indexes

-- NAME: rebuild_indexes
-- PURPOSE: Rebuild all StreamServe indexes
-- PARAMETERS:
-- v_maxrunhours: Max hours to run (will not exit immediately,
--   but no new loop iteration will be started)
-- v_indexname_regexp: Optional POSIX regexp case-insensitive
--   search string for only rebuilding a subset
--   of all indexes, e.g. all indexes starting
--   with letter A-H: '^[A-H].*$'
--   Reference: "Multilingual Regular Expression
--   Syntax" in "Oracle Database SQL Reference"
-- v_tablespace_dest: Move indexes to new tablespace when
--   rebuilding
-- v_parallel_mb_breakpoint: Run in parallel for indexes larger
--   than this MB
-- v_online: Run rebuild commands with the ONLINE
--   clause
-- v_sort_area_size_mb: Set sort_area_size in MB for session
--   (Uses default Oracle initialization
--   parameter settings if not set)
-- MISCELLANEOUS:
-- Direct privilege "CREATE TABLE" is needed for package owner
PROCEDURE rebuild_indexes (
  v_maxrunhours IN NUMBER DEFAULT 24*365,
  v_indexname_regexp IN VARCHAR2 DEFAULT '.*',
  v_tablespace_dest IN VARCHAR2 DEFAULT NULL,
  v_parallel_mb_breakpoint IN INTEGER DEFAULT -1,
  v_online IN BOOLEAN DEFAULT TRUE,
  v_sort_area_size_mb IN INTEGER DEFAULT -1);
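As an alternative to the rebuild procedure above, coalescing merges sparsely filled leaf blocks of an index in place. The following is a minimal sketch, using the index name from the earlier indexing examples. The second statement only generates the commands for all normal B-tree indexes in the current schema and does not execute them.

```sql
-- Sketch only: coalesce a single index. ixlcol_1 is the index name used
-- in the earlier indexing examples.
ALTER INDEX ixlcol_1 COALESCE;

-- Generate (but do not run) a COALESCE command for every normal B-tree
-- index owned by the current user.
SELECT 'ALTER INDEX ' || index_name || ' COALESCE;' AS coalesce_command
FROM   user_indexes
WHERE  index_type = 'NORMAL'
ORDER  BY index_name;
```

Unlike a rebuild, coalescing does not need extra space for a second copy of the index, which is one reason it is often the preferred first step.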

Example 9 Rebuilding indexes according to specification

In this example, the indexes for all search tables (that is, tables that begin with "META_") in the StreamServe archive are rebuilt in the following way:

- Rebuild to the same tablespace (that is, v_tablespace_dest not set).
- Use offline mode (since this is faster than the online mode).
- Use parallel mode for all indexes larger than 50 MB.
- Stop after the first index that finishes after 12 hours.
- Use a custom sort_area_size of 100 MB per parallel thread.

Note: Parallel or online rebuild requires the Enterprise Edition of the Oracle Database.

    BEGIN
       maintenance.rebuild_indexes (
          v_online=>false,
          v_tablename_regexp=>'meta_.*',
          v_parallel_mb_breakpoint=>50,
          v_maxrunhours=>12,
          v_sort_area_size_mb=>100 );
    END;
    /

Example 10 Rebuilding indexes using default options

In this example, the indexes for all tables in the StreamServe archive are rebuilt using the default options:

    BEGIN
       maintenance.rebuild_indexes ();
    END;
    /

Ensuring sufficient free space

When rebuilding indexes, you must make sure there is sufficient free space in the destination tablespaces. The free space must be at least as large as the largest index.

You must run the SQL query as a DBA user, for example as SYSTEM.

When running the SQL to check free space, make sure that smartfree_mb (free space plus remaining auto extended space) is larger than maxindex_mb (size of largest index in tablespace).

Note: You must make sure that there is space left in the file system for the data files to extend.

Example 11 SQL to check free space

    SELECT tablespace_name tablespace,
           ROUND(SUM(filesize)/1024/1024) filesize_mb,
           ROUND((SUM(filesize)-SUM(filefree))/1024/1024) used_mb,
           ROUND(SUM(autoextend_max)/1024/1024) autoextend_max_mb,
           ROUND((SUM(filefree)+SUM(autoextend_max)-SUM(filesize))/1024/1024) smartfree_mb,
           ROUND(MAX(maxindexbytes)/1024/1024) maxindex_mb
    FROM (SELECT tablespace_name,
                 0 filesize,
                 SUM(bytes) filefree,
                 0 autoextend_max,
                 0 maxindexbytes
          FROM dba_free_space
          GROUP BY tablespace_name
          UNION ALL
          SELECT tablespace_name,
                 SUM(bytes) filesize,
                 0 filefree,
                 SUM(GREATEST(bytes,maxbytes)) autoextend_max,
                 0 maxindexbytes
          FROM dba_data_files
          GROUP BY tablespace_name
          UNION ALL
          SELECT tablespace_name,
                 0 filesize,
                 0 filefree,
                 0 autoextend_max,
                 MAX(bytes) maxindexbytes
          FROM dba_segments
          WHERE segment_type = 'INDEX'
          GROUP BY tablespace_name)
    GROUP BY tablespace_name
    ORDER BY tablespace_name;

Running maintenance procedures as database jobs

In Oracle 11g, there are two ways to create database jobs:

- Using DBMS_SCHEDULER (recommended for administrative reasons). See Creating database jobs using DBMS_SCHEDULER.
- Using DBMS_JOB (old way, applicable for already existing frameworks). See Creating database jobs using DBMS_JOB.

Creating database jobs using DBMS_SCHEDULER

When using DBMS_SCHEDULER for running database jobs, it is recommended to configure Oracle Enterprise Manager (either the Database Control or the Grid Control), and use the GUI to create, schedule, and monitor the jobs.

Creating database jobs using DBMS_JOB

Using DBMS_JOB is the old way to create a database job. Oracle recommends using DBMS_SCHEDULER instead, see Creating database jobs using DBMS_SCHEDULER above.

Example 12 Gathering statistics using DBMS_JOB

In this example, the job gathers optimizer statistics every night at 5:00 AM. The gather sample size is 5%, and Oracle decides the degree of parallelism.

    DECLARE
       jobno NUMBER;
    BEGIN
       DBMS_JOB.SUBMIT (
          jobno,
          'maintenance.gather_stats ( v_estimate_pct=>5);',  -- Command
          to_date(' :00:00','YYYY-MM-DD HH24:MI:SS'),        -- Startdate
          'sysdate+1' );                                     -- Interval
    END;
    /
    COMMIT;

To schedule another procedure, simply replace the Command, Startdate, and Interval parameters.
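For sites that prefer a script over the Enterprise Manager GUI, the nightly statistics job of Example 12 could be expressed with DBMS_SCHEDULER roughly as follows. This is a sketch under assumptions, not a StreamServe-supplied script. The job name XXGATHER_STATS_JOB and the comment text are hypothetical, and the job is assumed to be created by the owner of the maintenance package.

```sql
BEGIN
   DBMS_SCHEDULER.CREATE_JOB (
      job_name        => 'XXGATHER_STATS_JOB',  -- hypothetical job name
      job_type        => 'PLSQL_BLOCK',
      job_action      => 'BEGIN maintenance.gather_stats (v_estimate_pct=>5); END;',
      start_date      => SYSTIMESTAMP,
      -- Run every day at 05:00, matching the schedule in Example 12
      repeat_interval => 'FREQ=DAILY; BYHOUR=5; BYMINUTE=0; BYSECOND=0',
      enabled         => TRUE,
      comments        => 'Nightly optimizer statistics (illustrative)');
END;
/
```

The calendar expression in repeat_interval replaces the 'sysdate+1' interval string used by DBMS_JOB, which avoids the gradual drift that a date-arithmetic interval can cause.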

Monitoring maintenance sessions and jobs

You must run all SQL queries below as a DBA user, for example as SYSTEM.

Note: As an alternative to the queries below, you can use the Oracle Enterprise Manager.

Information about sessions

Run this query to get information on current maintenance sessions:

    SELECT module, sid, serial#, client_info,
           TO_CHAR (logon_time, 'YYYY-MM-DD HH24:MI:SS') logon_time,
           last_call_et, username, status
    FROM v$session
    WHERE module LIKE 'maintenance.%'
    ORDER BY module, sid, serial#;

Information about current maintenance session operations

Run this query to get information on current maintenance session operations:

    SELECT s.module, s.sid, s.serial#, s.client_info, so.username,
           opname, target_desc, sofar, totalwork, units,
           TO_CHAR (start_time, 'YYYY-MM-DD HH24:MI:SS') start_time,
           TO_CHAR (last_update_time, 'YYYY-MM-DD HH24:MI:SS') last_update_time,
           time_remaining, elapsed_seconds
    FROM v$session s, v$session_longops so
    WHERE s.sid = so.sid
      AND s.serial# = so.serial#
      AND s.module LIKE 'maintenance.%'
      AND so.sofar < so.totalwork
    ORDER BY s.module, s.sid, s.serial#;

DBMS_SCHEDULER - Current running jobs

Run this query to get information on the currently running DBMS_SCHEDULER jobs:

    SELECT *
    FROM dba_scheduler_running_jobs
    ORDER BY owner, job_name;

DBMS_SCHEDULER - Job history

Run this query to get information on the DBMS_SCHEDULER job history:

    SELECT *
    FROM dba_scheduler_job_run_details
    ORDER BY owner, job_name, log_date;

DBMS_JOB - Current running jobs

Run this query to get information on the currently running DBMS_JOB jobs:

    SELECT s.username, s.osuser, s.client_info, s.module, j.*
    FROM v$session s, dba_jobs_running j
    WHERE s.sid = j.sid
    ORDER BY s.username;

DBMS_JOB - Job history

Run this query to get information on the DBMS_JOB job history:

    SELECT *
    FROM dba_jobs
    ORDER BY schema_user;
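The DBMS_SCHEDULER job history can grow large, so it may help to narrow it to recent failures. The following query is a sketch along those lines. The seven-day window is an arbitrary assumption, and the query must be run as a DBA user like the other queries in this section.

```sql
-- List scheduler job runs that failed during the last 7 days (window is an assumption).
SELECT log_date, owner, job_name, status, error#, run_duration, additional_info
FROM dba_scheduler_job_run_details
WHERE status = 'FAILED'
  AND log_date > SYSTIMESTAMP - INTERVAL '7' DAY
ORDER BY log_date DESC;
```

The additional_info column usually carries the Oracle error text for the failed run, which is often enough to tell a maintenance failure from a scheduling problem.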

Monitoring jobs in the runtime repository

This section contains some useful tips for monitoring jobs in the runtime repository, for example, monitoring top job statuses (completed, failed, removed, etc.) in order to get an overview of the job deletion process.

When monitoring jobs and job statuses, it is recommended to primarily use the tools provided by StreamServe.

As a complement to the StreamServe tools, you can also run queries against the repository, for example, to receive a summary of the total number of top jobs in each processing status. For more information, see:

- Queries when monitoring jobs and job statuses.
- Queries when monitoring job delete performance.

Database Administration Tool

You can use StreamServe Database Administration Tool to monitor jobs and job statuses, for example, to get an overview of the statuses of the current top jobs and the included input and output jobs. Based on the information, you can then further administer the jobs, for example, expire and delete top jobs, and cancel sub jobs. For more information, see the Database Administration Tool documentation.

StreamStudio Reporter

With the StreamStudio Reporter application you can administer jobs that are received, processed, and produced by StreamServer applications. For example, you can view job status, resend jobs, and delete jobs from the queues. For more information, see the Reporter documentation.

StreamServe Status Messenger

You can use Status Messenger to generate status reports for jobs. For example, if a job fails, an email can be generated and sent to a system administrator who can take the proper action. For more information, see the Status Messenger documentation.

To use Status Messenger, you must enable notifications in the Configure Platform dialog box (the Use notifications setting). Enabling notifications always affects performance.

Queries when monitoring jobs and job statuses

This section contains some queries which you can use to monitor jobs and the processing statuses for the jobs in the runtime repository. Run the queries as a DBA user, for example SYSDBA, in the appropriate Oracle tool, for example Oracle SQL*Plus or Oracle SQL Developer.

Available processing statuses

A processing status describes the last completed processing status of a job. The following statuses are available:

    Status  Name         Description
    0       N/A          A processing status is not applicable (for example,
                         when a job is being created). No processing state
                         has yet been finished.
    1       Completed    The job is completed, with or without errors.
    2       Cancelled    The processing of the job is cancelled.
    3       Aborted      The job has failed the maximum number of retries.
    4       Removed      The job is marked for deletion and will be removed.
    5       Failed over  Another StreamServer application has taken over the
                         job due to failure of execution.

StructureTypeID

In the queries, <PartType_Top_Empty> is the following Structure Type ID for top jobs, as described in the dbo.structuretype table in the runtime repository:

    AB04C A-B78B AA194

In this section

- Retrieving a summary of the current top job statuses
- Retrieving a summary of events waiting for processing
- Retrieving the queues that hold events waiting for processing
- Retrieving the total number of failed top jobs
- Retrieving all error messages
- Retrieving a summary of each type of error message
- Retrieving failed jobs with error messages
- Retrieving expired top jobs with error codes and statuses
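The numeric processing statuses listed above can be translated into their names directly in a query. The following is a minimal sketch, not a StreamServe-supplied query. It assumes the Part table and its ProcessingStatus and StructureTypeID columns as they are used elsewhere in this section, and the column aliases are illustrative.

```sql
-- Count top jobs per processing status, showing the status name
-- from the table above instead of the numeric code.
SELECT CASE ProcessingStatus
          WHEN 0 THEN 'N/A'
          WHEN 1 THEN 'Completed'
          WHEN 2 THEN 'Cancelled'
          WHEN 3 THEN 'Aborted'
          WHEN 4 THEN 'Removed'
          WHEN 5 THEN 'Failed over'
          ELSE 'Unknown'
       END AS status_name,
       count(*) AS top_jobs
FROM Part
WHERE StructureTypeID = '<PartType_Top_Empty>'
GROUP BY ProcessingStatus
ORDER BY ProcessingStatus;
```

As in the other queries, replace <PartType_Top_Empty> with the Structure Type ID for top jobs before running.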

Retrieving a summary of the current top job statuses

To retrieve the number of current top jobs for each combination of processing status and error code, run the query below.

    SELECT count(*), ProcessingStatus, ErrorCode
    FROM Part
    WHERE StructureTypeID='<PartType_Top_Empty>'
      AND ExpiringDateTime < SYS_EXTRACT_UTC(SYSTIMESTAMP)
    GROUP BY ProcessingStatus, ErrorCode ;

Retrieving a summary of events waiting for processing

To retrieve the total number of events waiting for processing, run the query below.

    SELECT count(*), StatusCodeEvent
    FROM QStatusReport
    GROUP BY StatusCodeEvent ;

Retrieving the queues that hold events waiting for processing

To retrieve information about which queues hold the events waiting for processing, run the query below.

    SELECT count(*), p.structuretypeid, aq.queuename
    FROM QStatusReport q
         inner join Part p on q.partid = p.partid
         inner join Queue aq on q.queueid = aq.queueid
    WHERE StatusCodeEvent = 1
    GROUP BY p.structuretypeid, aq.queuename ;

Retrieving the total number of failed top jobs

To retrieve the total number of failed top jobs, run the query below.

    SELECT count(*)
    FROM Part
    WHERE ((ErrorCode < 0) OR (ErrorCode >= 0))
      AND ProcessingStatus NOT IN (0,4)
      AND StructureTypeID='<PartType_Top_Empty>' ;

Retrieving all error messages

To retrieve the last error messages for all failed jobs, run the query below.

    SELECT distinct LastErrorMessage
    FROM FixedMetaData
    WHERE LastErrorMessage IS NOT null ;

Retrieving a summary of each type of error message

To retrieve the number of each type of last error message for all failed jobs, run the query below.

    SELECT count(*), LastErrorMessage
    FROM FixedMetaData
    WHERE LastErrorMessage IS NOT null
    GROUP BY LastErrorMessage ;

Retrieving failed jobs with error messages

To retrieve all failed jobs and the last error message for each job, run the query below.

    SELECT PartID, LastErrorMessage
    FROM FixedMetaData
    WHERE LastErrorMessage IS NOT null ;

Retrieving expired top jobs with error codes and statuses

To retrieve all expired top jobs and the corresponding error codes and processing statuses, run the query below.

    SELECT PartID, TrackerID, ErrorCode, ProcessingStatus
    FROM Part
    WHERE StructureTypeID='<PartType_Top_Empty>'
      AND ExpiringDateTime < SYS_EXTRACT_UTC(SYSTIMESTAMP)
    ORDER BY ProcessingStatus, ErrorCode ;

Queries when monitoring job delete performance

This section contains some queries which you can use to monitor the performance of the job deletion process. Run the queries as a DBA user, for example SYSDBA, in the appropriate Oracle tool, for example Oracle SQL*Plus or Oracle SQL Developer.

In this section

- Retrieving a summary of the spread of job deletion
- Retrieving the number of generated top jobs
- Retrieving the number of notifications

Retrieving a summary of the spread of job deletion

When the prescribed Design Center setting Store information and job is selected for successful and failed jobs, the job deletion is carried out on top job level. This is the recommended setting and gives the best delete performance.

However, if the No or Store information only setting is selected instead, the job deletion is spread among input and output jobs as well. This results in an increased number of delete events and decreased delete performance.

A typical case where this can occur is if you run an upgraded Project with old settings, for example, an upgraded StreamServe 4.x Project or a StreamServe Persuasion Project upgraded from a version older than SP4.

If you experience decreased delete performance, you can run the query below to find out if and how the job deletion is spread over top jobs, input jobs, and output jobs.

    SELECT count(*), StructureTypeID
    FROM QStatusReport q
         inner join Part p on q.partid = p.partid
    WHERE q.statuscodeevent=-1
    GROUP BY StructureTypeID ;

To interpret the result, you must look up the structuretypeid values in the dbo.structuretype table in the runtime repository:

    Delete event on top jobs      PartType_Top_Empty
    Delete event on input jobs    PartType_Inputdata_Entity
    Delete event on output jobs   PartType_Data_Entity

If the job deletion is spread over input and output jobs, it is strongly recommended to select the Store information and job setting in Design Center.

Retrieving the number of generated top jobs

To decide if the job deletion is scheduled frequently enough, you can run a query to find out the number of top jobs generated during a specified time span.

For example, you can run the query below to find out the number of top jobs generated in each 10-minute interval since a given start date. If the number of generated top jobs exceeds 5000, the job deletion should be run more frequently than every 10 minutes.

In the query, <PartType_Top_Empty> is the following Structure Type ID for top jobs, as described in the dbo.structuretype table in the runtime repository:

    AB04C A-B78B AA194

    SELECT count(*), CreDate
    FROM (SELECT SUBSTR(TO_CHAR(CreationDateTime, 'YYYY-MM-DD HH24:MI:SS'), 0, 15) CreDate
          FROM Part
          WHERE StructureTypeID = '<PartType_Top_Empty>'
            AND CreationDateTime >= TO_TIMESTAMP (' ','YYYY-MM-DD'))
    GROUP BY CreDate ;

Retrieving the number of notifications

Too many notifications may result in decreased delete performance, especially if each top job generates only one or a few output jobs.

If you experience decreased performance, you can run the query below to find out the total number of notifications in the repository.

    SELECT count(*) FROM Notification ;

If the number of notifications is large, consider reducing the notifications, for example, by using the -reducenotifications startup argument in Design Center. For more information, see the Startup arguments reference manual and the Design Center documentation.
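The 5000-job threshold for 10-minute intervals described under Retrieving the number of generated top jobs can be checked directly by filtering on the count. The query below is a sketch of that idea. The start date 2013-01-01 is a hypothetical value chosen for illustration, and the column alias top_jobs is not part of the repository.

```sql
-- List only the 10-minute intervals in which more than 5000 top jobs were created.
SELECT CreDate, count(*) AS top_jobs
FROM (SELECT SUBSTR(TO_CHAR(CreationDateTime, 'YYYY-MM-DD HH24:MI:SS'), 1, 15) CreDate
      FROM Part
      WHERE StructureTypeID = '<PartType_Top_Empty>'
        -- Hypothetical start date; replace with the date you want to analyze from
        AND CreationDateTime >= TO_TIMESTAMP('2013-01-01', 'YYYY-MM-DD'))
GROUP BY CreDate
HAVING count(*) > 5000
ORDER BY CreDate;
```

The first 15 characters of the formatted timestamp end at the tens digit of the minute, which is what groups the rows into 10-minute intervals. An empty result suggests that the current deletion schedule keeps up with the rate at which top jobs are generated.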

Performing backup of StreamServe repositories

For database backup, follow the normal Oracle procedures.

It is recommended to run any production database in archiving (ARCHIVELOG) mode. Put the database in archiving mode, and make sure that a full backup is performed after the installation of StreamServe.

For more information, see the Oracle user documentation.
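The recommendation to run in archiving mode can be checked, and if necessary applied, from SQL*Plus. The commands below are a generic Oracle sketch rather than a StreamServe procedure. They assume a single-instance database, a connection AS SYSDBA, and a maintenance window, since the database is restarted.

```sql
-- Check the current mode; the result is ARCHIVELOG or NOARCHIVELOG.
SELECT log_mode FROM v$database;

-- Switch to archiving mode (SQL*Plus commands, connected AS SYSDBA).
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
```

Make sure the archive log destination has enough disk space before switching, because the database stops accepting changes when the destination fills up.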


Appendix A - Time scheduling syntax

The time scheduling syntax looks like the following:

1  The overall start time for the schedule. The time value is always in local time for the computer that is parsing it. The number is formatted in the following way: YYYYMMDDhhmmssfff, where:

       Y: year
       M: month
       D: day
       h: hour
       m: minute
       s: second
       f: millisecond

   If no start time exists, the time can be replaced with *.

2  The overall stop time for the schedule, formatted as described above.

3  The type of time interval that the scheduling concerns:

       Y:  Year
       MY: Month
       WY: Week of year
       WM: Week of month
       DY: Day of year
       DM: Day of month
       DW: Day of week
       H:  Hour
       MH: Minute
       S:  Second
       MS: Millisecond

4  The start value for the time interval. If no start value exists, the value can be replaced with *.

5  The stop value for the time interval. If no stop value exists, the value can be replaced with *.

6  The step value for the time interval.

Note: The start and stop values are inclusive. For example, if you specify "T II * * MH "/, the scheduled event will run at 03:00, 05:00, 07:00, 09:00, 11:00, 13:00, and 15:00.

Example 13 Scheduling deletion of marked jobs

To delete marked jobs every year:

    <deleteevent schedule="t II * * Y * * 1"/>

To delete marked jobs every second week:

    <deleteevent schedule="t II * * WY * * 2"/>

To delete marked jobs every second hour between 12:00 and 20:00 every day:

    <deleteevent schedule="t II * * H "/>

To delete marked jobs every five minutes between minute 10 and minute 50 in every hour:

    <deleteevent schedule="t II * * MH "/>
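The YYYYMMDDhhmmssfff layout used for the overall start and stop times maps onto an Oracle format model, which can help when comparing a schedule with timestamps stored in the repository. The statement below is only an illustration of that mapping. The value 20130101050000000 is a hypothetical start time, and the statement is not something StreamServe itself runs.

```sql
-- Interpret a schedule start time written as YYYYMMDDhhmmssfff.
-- 20130101050000000 is a hypothetical value: 1 January 2013, 05:00:00.000.
SELECT TO_TIMESTAMP('20130101050000000', 'YYYYMMDDHH24MISSFF3') AS schedule_start
FROM dual;
```

Because the schedule is parsed in the local time of the computer running it, any comparison with UTC values in the repository needs a time zone conversion.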

Appendix B - SQL EXPLAIN PLAN and datatype RAW

Reading or writing data in an Oracle database is done by issuing SQL statements. When Oracle receives such a statement, a query execution plan is built. The execution plan defines how Oracle finds or writes data and can be used when troubleshooting performance issues.

If a database administrator wants to explore an execution plan, the SQL statement EXPLAIN PLAN can be used.

StreamServe uses datatype RAW(16)

In Oracle, StreamServe uses datatype RAW(16) for the majority of the GUIDs. When a RAW column is used in the WHERE clause of a query, the value must also be RAW. If not, Oracle will not use any index on that column.

The application code uses RAW bind variables, so any SQL statement that is used to explain the SQL plan must be written in the same way. You cannot use string constants (for example 'my_raw_value') since this would not be the same SQL statement as the one run by the StreamServe applications.

To display an SQL explain plan for datatype RAW

To display the SQL explain plan that Oracle actually uses when a column is of datatype RAW, you can do one of the following:

For most correct result

Check the actual SQL statement run by the application code (for example, by using Oracle Enterprise Manager). Put the SQL statement in a PL/SQL block, use a RAW(16) bind variable, and perform an SQL trace while running the SQL statement. Then format the trace file and check the plan. For example:

    DECLARE
       my_raw_value RAW(16) := '<MYRAWVALUE>';
    BEGIN
       -- A cursor loop is used because a plain SELECT without INTO
       -- is not allowed in PL/SQL
       FOR r IN (SELECT * FROM my_table
                 WHERE my_raw_field = my_raw_value) LOOP
          NULL;
       END LOOP;
    END;
    /

Easiest during testing and development phases

Use the CAST function in your SQL statement. Note that this is only for testing. It is not how it is done in the application code (where RAW bind variables are used). For example:

    SELECT * FROM my_table
    WHERE my_raw_field = CAST('<MYRAWVALUE>' AS RAW(16));
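During testing, the plan for such a statement can be displayed with EXPLAIN PLAN and DBMS_XPLAN. The following is a minimal sketch that reuses the my_table and my_raw_field placeholders from the examples above. It uses HEXTORAW instead of CAST as an assumed equivalent for converting a hexadecimal string to RAW, and it assumes a PLAN_TABLE is available to the session, which is the default in Oracle 11g.

```sql
-- Store the execution plan for the statement in the plan table.
-- <MYRAWVALUE> is a placeholder for a 32-character hexadecimal GUID.
EXPLAIN PLAN FOR
   SELECT * FROM my_table
   WHERE my_raw_field = HEXTORAW('<MYRAWVALUE>');

-- Display the most recently explained plan.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan shows a full table scan on an indexed RAW column, check that the comparison value is really of datatype RAW, as described above. As with the CAST approach, this only approximates the statement that the application runs with bind variables.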
