Perforce Helix Threat Detection On-Premise Deployment Guide Version 3
On-Premise Installation and Deployment

1. Prerequisites and Terminology

Each server in the deployment must be identified as either an analytics server or a reporting server. There must be an odd number of analytics servers, and one of the analytics servers is designated as the master server. Valid configurations include one analytics server and one reporting server, or three analytics servers and two reporting servers. The steps to configure the two types of servers are given separately in this document.

The base OS for all servers is Ubuntu 14.04 LTS. The hostnames for the servers are arbitrary, but the instructions in this document refer to the master analytics server using these names:

<SPARK_MASTER>
<NAMENODE>
<HMASTER>
<ZOOKEEPER>

In all cases the tag must be replaced with the actual hostname of the master analytics server. The reporting server will also be referred to as <REPORTING>; this can be replaced with the hostname of any of the reporting servers.

Before you begin, you should have the following files available. The files will be copied onto the analytics and reporting servers in the steps below.

Analytics deployment bundle: wget --no-check-certificate...
Reporting deployment bundle: wget --no-check-certificate
2. Reference Architecture

The architecture consists of the following components: Investigator / API Server, Analytics Master, and Analytics Data.

2.1. Investigator / API Server

This component is responsible for taking the results of the analytics and presenting them through a consumption interface. This includes the Investigator interface, static reports, and a RESTful interface for integration with third-party systems.

2.2. Analytics Master

This component is responsible for managing the analytics data tier and for the orchestration of the analytics jobs.

2.3. Analytics Data

This component is responsible for storing the data and running the analytical models that create metrics such as baselines, behaviour risk scores, entity risk scores, and others. It stores and serves log data to the Analytics Master component, and it stores the metrics that result from the analytical models.
3. System Requirements

These system requirements provide guidelines on the resources necessary to run the system under typical usage. The guidelines are subject to re-evaluation based on usage patterns within each organization, which can vary.

3.1. POC (Proof of Concept) System

Maximum: 30 days of data / 1k users

CPU Cores   16
Memory      32 GB
HDD         100 GB
Network     GigE
3.2. Production System

Investigator / API Server (x2 for a High Availability system)
            Minimum    Recommended
CPU Cores   8          16
Memory      16 GB      24 GB
HDD         100 GB     100 GB
Network     GigE       10GbE

Analytics Master (x2 for a High Availability system)
            Minimum    Recommended
CPU Cores   8          16
Memory      8 GB       16 GB
HDD         100 GB     100 GB
Network     GigE       10GbE

Analytics Data (x3-5 for a High Availability system)
            Minimum    Recommended
CPU Cores   8          16
Memory      32 GB      64 GB
HDD         100 GB     70 GB / 1k users / month
Network     GigE       10GbE
4. Server Setup

You will need two Ubuntu 14.04 LTS servers. Install the prerequisites on both servers:

sudo apt-get install wget unzip openssh-server

The default /etc/hosts file should look like this:

127.0.0.1 localhost
127.0.1.1 <server-name>

Change it to look like this:

127.0.0.1 localhost
$ACTUAL_IP_ADDRESS $ANALYTICS_SERVER_NAME

4.1. Create Users

Create the interset user on the server. The default ubuntu user's password will need to be provided when running the commands below with sudo, and a new password will need to be provided for the interset user being created. As the default ubuntu user:

sudo su
useradd -m -d /home/interset -s /bin/bash -c "Interset User" -U interset
usermod -a -G sudo interset
echo "# User rules for Interset" >> /etc/sudoers.d/90-cloud-init-users
echo "interset ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers.d/90-cloud-init-users
exit

Set a password for the interset user:

sudo passwd interset

All steps following this should be done as the interset user, so at this point log out and log back in as interset.

4.2. Interset Folder

sudo mkdir /opt/interset
sudo chown interset:interset /opt/interset

4.3. SSH Keys

On the first server (usually the analytics server), create an SSH key:

ssh-keygen

Then copy the key to the other server:

ssh-copy-id interset@<hostname of the other server>
Ensure that you're able to ssh from each server to the other without entering a password (i.e. ssh <server> should give you a remote shell without prompting for a password).

4.4. Java 8

Install Java 8:

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install -y oracle-java8-installer
sudo apt-get -fy install

To check, run java -version; you should see output like:

java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)

Set JAVA_HOME:

echo " " >> ~/.bashrc
echo "export JAVA_HOME=/usr/lib/jvm/java-8-oracle" >> ~/.bashrc

Source ~/.bashrc to pick up the newly set environment variables:

source ~/.bashrc
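Before moving on to the analytics setup, it can be worth confirming the prerequisites from sections 4.3 and 4.4 on both servers. A minimal sketch follows; <other-server> is a placeholder for the other server's hostname.

# Confirm Java 8 and JAVA_HOME
echo $JAVA_HOME      # expected: /usr/lib/jvm/java-8-oracle
java -version

# Confirm passwordless SSH to the other server (no password prompt expected)
ssh interset@<other-server> hostname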
5. Analytics Server Setup

All of the steps in this section should be done as the interset user on the analytics server.

5.1. Install Analytics

Download the analytics-deploy file to the /opt/interset directory:

cd /opt/interset
wget http://.../analytics-3.0.x.xxx-bin.tar.gz
tar xvfz analytics-3.0.x.xxx-bin.tar.gz
rm analytics-3.0.x.xxx-bin.tar.gz
ln -s /opt/interset/analytics-3.0.x.xxx analytics
cd /opt/interset/analytics/automated_install/
sudo ./deploy.sh

Note: The script will initially need input from the user for the IP address of the Analytics server and the heap sizes, so please have that information available. When entering the memory heap size, enter only the number.

This script takes some time to execute. Look for the following message to confirm that it has completed:

Execution of [deploy.sh] complete.

5.2. HDFS

Format HDFS (name node only):

cd /opt/interset/hadoop/bin
./hdfs namenode -format

Answer Yes to the prompt to re-format the filesystem in the Storage Directory.

Note: Do NOT run the format command if HDFS is already running, as it will cause data loss. If this is a first-time setup, HDFS will not be running.

You should see something like the following (note the Exiting with status 0 on the fifth-last line): https://gist.github.com/tamini/eb63dda92cc688a9db22

Install and start the HDFS services:

sudo ln -s /opt/interset/analytics/bin/hdfs.service.sh /etc/init.d/hdfs
sudo update-rc.d hdfs defaults
sudo service hdfs start

Answer yes to the prompt asking if you wish to continue connecting. After entering the above start commands, type jps as a check; the output will look like (ignore the process IDs/numbers):
13342 Jps
13278 DataNode
13135 NameNode

Another good check is to load the HDFS web UI. By default it can be found at http://hostname:50070, where hostname is the name node running HDFS.

5.3. HBase

Install and start the HBase services:

sudo ln -s /opt/interset/analytics/bin/hbase.service.sh /etc/init.d/hbase
sudo update-rc.d hbase defaults
sudo service hbase start

To check that everything is running as it should, use jps again. For the name node it should output (ignore the numbers):

16619 HMaster
16799 HRegionServer
16521 HQuorumPeer
16182 DataNode
17063 Jps
16026 NameNode

The HBase web UI is also available at http://hostname:60010.

5.4. Spark

Install and start the Spark services:

sudo ln -s /opt/interset/analytics/bin/spark.service.sh /etc/init.d/spark
sudo update-rc.d spark defaults
sudo service spark start

As a test, use the jps command; the output should look like the following (on a single-server deployment):

28352 HMaster
28258 HQuorumPeer
29140 Worker
28538 HRegionServer
27723 DataNode
28957 Master
29422 Jps
27567 NameNode

As a quick test, run one of the examples that comes with Spark:

/opt/interset/spark/bin/run-example SparkPi 10

It will output a lot of information and a line approximating the value of Pi.

5.5. Configure Analytics

Set up a cron task to run the analytics daily (e.g. using crontab -e):
0 0 * * * /opt/interset/analytics/bin/analytics.sh /opt/interset/analytics/conf/interset.conf

Create the analytics schema (<ZOOKEEPER> is the analytics server):

cd /opt/interset/analytics/bin
./sql.sh --dbserver <ZOOKEEPER> --action migrate
./sql.sh --dbserver <ZOOKEEPER> --action migrate_aggregates

Source ~/.bashrc to pick up newly set environment variables:

source ~/.bashrc

5.6. Ingest

Configure the interset.conf configuration file:

cd /opt/interset/analytics/conf
vi interset.conf

Configure the ingestfolder, ingestingfolder and ingestedfolder settings to the desired locations; the defaults will work if the file is left unaltered. Configure reportservers with the complete list of all your Reporting servers.

Start the ingest process:

/opt/interset/analytics/bin/ingest.sh /opt/interset/analytics/conf/interset.conf

Running jps will now show the Ingest process as running. The log file for the ingest can be followed with:

tail -f /opt/interset/analytics/logs/ingest.0.log

NOTE: The settings in the conf file can be modified on the fly without restarting the process/service. Changing the ingest folder location(s) changes where the system looks for files to pick up (i.e. the ingest, ingesting, ingested and ingesterror folders).

You have now completed the setup of the Analytics server, and the server is ready to ingest logs.
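For reference, the ingest-related portion of interset.conf on a small deployment might look like the sketch below. The folder paths are the shipped defaults (see section 8); reporting01 is a hypothetical reporting server hostname and should be replaced with your own.

# Watch-folder locations (shipped defaults)
ingestfolder = /tmp/ingest
ingestingfolder = /tmp/ingest/ingesting
ingestedfolder = /tmp/ingest/ingested
ingesterrorfolder = /tmp/ingest/ingesterror

# Comma-separated list of all reporting servers (hypothetical hostname shown)
reportservers = reporting01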
6. Reporting Server Setup

All of the steps in this section should be done as the interset user on the reporting server.

Install prerequisites:

sudo apt-get install nginx

6.1. Install Reporting

Download the Reporting archive:

sudo mkdir /opt/interset
sudo chown interset:interset /opt/interset
cd /opt/interset
wget reporting-3.0.x.xxx-deploy.tar.gz
tar xzvf reporting-3.0.x.xxx-deploy.tar.gz
rm -f reporting-3.0.x.xxx-deploy.tar.gz
ln -s reporting-3.0.x.xxx/ reporting
echo " " >> ~/.bashrc
echo "export PATH=\$PATH:/opt/interset/reporting/reportGen/bin" >> ~/.bashrc
sh /opt/interset/reporting/reportgen/scripts/setupreportsenvironment.sh
source ~/.bashrc

6.2. Nginx

sudo mv /opt/interset/reporting/nginx.conf /etc/nginx/sites-enabled/default
sudo service nginx restart

Edit investigator.yml:

cd /opt/interset/reporting
vi investigator.yml

Change the line:

url: jdbc:phoenix:$ANALYTICS:2181

so that $ANALYTICS is replaced with the hostname of your Analytics server.

Configure the domain in the interset-cookie section with the fully-qualified host name of the reporting server:

interset-cookie:
  domain: reporting.company.com

Create the log folder:

mkdir logs

6.3. Start Reporting

Create the users database. This will create two users:

cd /opt/interset/reporting
java -jar investigator-3.0.x.xxx.jar db migrate investigator.yml

The users are:

User name: user, password: password.
User name: admin, password: password.

Create and set reporting to run as a service:

sudo ln -s /opt/interset/reporting/reporting.service /etc/init.d/reporting
sudo update-rc.d reporting defaults

Once you've created the reporting service, reporting will start automatically at system startup. Use the following commands to start, stop and restart the reporting service:

sudo service reporting start
sudo service reporting stop
sudo service reporting restart

Start the reporting server:

sudo service reporting start

There is a log file for the Reporting server that you may wish to monitor:

tail -f /opt/interset/reporting/logs/reporting.log

The reporting web UI is available at http://<REPORTING>/.

You have now completed the setup of the Reporting server, and the server is ready to display the results of the Analytics. You can use the accounts user / password and admin / password to log in.
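Because the reporting application reads from Phoenix on the analytics server, a quick connectivity check from the reporting server can save debugging time. The sketch below uses bash's /dev/tcp feature and only verifies that the ZooKeeper port (2181, from the JDBC URL in section 6.2) is reachable; it does not confirm that Phoenix itself is healthy. Replace <ANALYTICS> with your analytics server hostname.

# TCP reachability check from the reporting server to the analytics server
bash -c 'echo > /dev/tcp/<ANALYTICS>/2181' && echo "ZooKeeper port 2181 is reachable"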
7. Upgrading From Earlier Releases

You can upgrade from 2.0, 2.1 or 2.2.

7.1. Ingest

Ensure there are no running Ingest daemons. To stop the ingest daemons:

kill -9 $(ps aux | grep 'Ingest' | grep -v grep | awk '{print $2}')

7.2. HBase

Ensuring there are no analytics jobs running, stop HBase:

/opt/interset/hbase/bin/stop-hbase.sh

Obtain the new version of HBase and extract the package to /opt/interset:

cd /opt/interset
wget https://archive.apache.org/dist/hbase/hbase-0.98.12/hbase-0.98.12-hadoop2-bin.tar.gz
tar xvf hbase-0.98.12-hadoop2-bin.tar.gz

Copy over the existing regionservers, hbase-site.xml and hbase-env.sh:

cp /opt/interset/hbase/conf/regionservers /opt/interset/hbase-0.98.12-hadoop2/conf
cp /opt/interset/hbase/conf/hbase-site.xml /opt/interset/hbase-0.98.12-hadoop2/conf
cp /opt/interset/hbase/conf/hbase-env.sh /opt/interset/hbase-0.98.12-hadoop2/conf

Update the hbase symlink to point to the new hbase-0.98.12-hadoop2 directory:

cd /opt/interset
unlink hbase
ln -s hbase-0.98.12-hadoop2 hbase

Obtain the new version of the Phoenix server JAR and copy the JAR into /opt/interset/hbase/lib:

cd /opt/interset/hbase/lib
wget https://s3-us-west-1.amazonaws.com/theta-deployment/infrastructure/phoenix-4.3.1-server.jar

Modify hbase-site.xml:

nano /opt/interset/hbase/conf/hbase-site.xml

Change the value of hbase.region.server.rpc.scheduler.factory.class from

- <value>org.apache.phoenix.hbase.index.ipc.PhoenixIndexRpcSchedulerFactory</value>

to

+ <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>

Add the following:

<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.regionserver.LocalIndexMerger</value>
</property>

Restart HBase:
/opt/interset/hbase/bin/start-hbase.sh

7.3. Spark

Ensuring there are no analytics jobs running, stop Spark:

/opt/interset/spark/sbin/stop-all.sh

Obtain the new version of Spark:

wget https://archive.apache.org/dist/spark/spark-1.3.0/spark-1.3.0-bin-hadoop2.4.tgz

Extract the package to /opt/interset/spark-1.3.0-bin-hadoop2.4:

tar xvf spark-1.3.0-bin-hadoop2.4.tgz

Obtain the new Phoenix client JAR:

cd /opt/interset/spark-1.3.0-bin-hadoop2.4/lib
wget https://s3-us-west-1.amazonaws.com/theta-deployment/infrastructure/phoenix-4.3.1-client.jar

Copy the slaves settings:

cp /opt/interset/spark/conf/slaves /opt/interset/spark-1.3.0-bin-hadoop2.4/conf

Copy the spark-env.sh settings:

cp /opt/interset/spark/conf/spark-env.sh /opt/interset/spark-1.3.0-bin-hadoop2.4/conf

Edit the new spark-env.sh and modify the line that starts with SPARK_CLASSPATH to read:

SPARK_CLASSPATH=/opt/interset/spark/lib/phoenix-4.3.1-client.jar

Save the file. Update the spark symlink to point to the spark-1.3.0-bin-hadoop2.4 directory.

Start Spark:

/opt/interset/spark/sbin/start-all.sh

7.4. Analytics

Obtain the updated analytics bundle. Unpack the bundle into a new analytics-3.0... directory under /opt/interset. Create or redirect the analytics symlink to point to the new unpacked directory.

(If upgrading from 2.0) Create an SSH key (hit ENTER at each prompt):

ssh-keygen
ssh-copy-id interset@<REPORTING SERVER HOSTNAME>

(If upgrading from 2.0) Migrate changes from the old .../deploy/conf/ingest.conf to the new .../analytics/conf/interset.conf file.

(If upgrading from 2.1 or 2.2) Migrate changes from the old interset.conf file in the old analytics directory to the new one.
You can now remove the old deploy-2.x or analytics-2.x directory and the associated deploy symlink (if present):

cd /opt/interset
unlink analytics
ln -s analytics-3.0.0.xxx analytics

Update the analytics database:

cd /opt/interset/analytics/bin
./sql.sh --dbserver <ZOOKEEPER> --action migrate
./sql.sh --dbserver <ZOOKEEPER> --action migrate_aggregates

Re-run the analytics (analytics.sh). This will also copy the new search indices to the reporting server.

7.5. Reporting

Stop the reporting process or service. Obtain the updated reporting bundle and unpack it into a new reporting-3.0... directory under /opt/interset. Update the reporting symlink to point to the new directory. Migrate your changes into investigator.yml (replacing $ANALYTICS with the analytics server name). Copy the reporting database investigator-db.mv.db from the previous folder to the new folder.

Update the reporting database:

java -jar /opt/interset/reporting/investigator-3.0.x.jar db migrate /opt/interset/reporting/investigator.yml

Create and set reporting to run as a service:

sudo ln -s /opt/interset/reporting/reporting.service /etc/init.d/reporting
sudo update-rc.d reporting defaults

Once you've created the reporting service, reporting will start automatically at system startup. Use the following commands to start, stop and restart the reporting service:

sudo service reporting start
sudo service reporting stop
sudo service reporting restart

Start the reporting server:

sudo service reporting start

You can now remove the old investigator-2.x directory.
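After the upgrade, a quick sanity check on the analytics server can confirm that the upgraded services came back up. This is simply the same jps check used during the initial install (section 5.4), not an additional upgrade step.

jps
# Expect NameNode, DataNode, HQuorumPeer, HMaster, HRegionServer, Master and Worker,
# plus Ingest once the ingest daemon has been restarted (section 5.6)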
8. Usage

1. Put some log files into the Watch Folder of the Analytics server. The Watch Folder location is set in /opt/interset/analytics/conf/interset.conf. The default Watch Folder locations are:

ingestfolder = /tmp/ingest
ingestingfolder = /tmp/ingest/ingesting
ingestedfolder = /tmp/ingest/ingested
ingesterrorfolder = /tmp/ingest/ingesterror

NOTE: The folders can be modified on the fly prior to ingesting a subsequent dataset.

2. You can monitor the ingest of the dataset via the ingest log file:

tail -f /opt/interset/analytics/logs/ingest.0.log

Once all the log files are ingested and processed (this can be verified by checking the ingesting and ingested folders), use the web UI on the Reporting server to see the results of the analytics. The web UI is available at:

http://<REPORTING>/
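As a concrete example of this workflow, assuming the default folder locations above and a log file named audit.log (a hypothetical file name and path):

# 1. Drop the log file into the watch folder on the analytics server
cp /path/to/audit.log /tmp/ingest/

# 2. Follow the ingest log; the file passes through /tmp/ingest/ingesting and
#    lands in /tmp/ingest/ingested (or /tmp/ingest/ingesterror on failure)
tail -f /opt/interset/analytics/logs/ingest.0.log

# 3. Verify where the file ended up
ls /tmp/ingest/ingested /tmp/ingest/ingesterror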
Configuring New Users

To access operations under the /tenants endpoint you'll need to authenticate as the root user. The default password for the root user is root.

To log into the Web application, you must create users that can authenticate in the system. By default, when the application is installed, a single tenant is created: tenant 0 (zero). New users should be created in this tenant, and any references to tenantid in this section refer to tenant 0.

Use the Swagger UI to access the Perforce Helix Threat Detection Analytics REST API. The Swagger UI is available at http://<reporting_server>/swagger-ui

1. Expand the PUT /tenants/{tenantid}/users/{userid} section.
2. Click the lower right box ("Model Schema") to copy the schema into the lower left box.
3. Fill in the tenantid and userid fields. You can delete the userid field from the JSON document; the userid is the name that you will enter as the 'User ID' when logging in to the Interset Analytics web UI.
4. Fill in the remaining fields in the JSON document:
   name - The user's full name.
   role - Either admin or user.
   isactive - This should be true.
   password - Set a password.
5. After filling in the document, click Try It Out! to add the user.

The difference between the admin and user roles is that the admin role is able to perform tasks using the REST API, such as configuring new users, while the user role is not.
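If you prefer the command line to the Swagger UI, the same operation can be sketched with curl. This is an illustration only: jsmith is a hypothetical user ID, the /api prefix is inferred from the password-change example in the Security section, and it assumes you have already obtained a bearer token by authenticating as the root user.

curl -X PUT \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{ "name": "Jane Smith", "role": "user", "isactive": true, "password": "<password>" }' \
  http://<reporting_server>/api/tenants/0/users/jsmith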
Security

This section describes how to configure your system in order to secure the environment.

Firewalls

To secure the system, run the commands below on all servers. These commands ensure that the system allows traffic between the servers, and blocks all other incoming traffic except ssh, http and https.

service ufw start
ufw allow ssh
ufw allow http
ufw allow https
ufw allow from <ip of reporting server1>
ufw allow from <ip of reporting server2>
ufw allow from <ip of analytics server1>
ufw allow from <ip of analytics server2>
ufw allow from <ip of analytics server3>
ufw deny from 0.0.0.0/0
ufw enable
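After enabling the firewall, you may want to confirm the resulting rule set on each server. ufw status is a standard ufw command; the expected output described in the comment is only an illustration.

ufw status verbose
# Expect ALLOW rules for 22 (ssh), 80 (http) and 443 (https), plus an ALLOW
# entry for each analytics and reporting server IP, followed by the DENY rule.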
Changing the Default Users' Passwords

There are three default user accounts. The root user has permission to add new tenants and login accounts (users) to the system; the root user is a member of the administrative tenant (tenant ID 'adm'), which doesn't contain any data. The admin and user users are members of the default tenant (tenant ID '0'). A new on-premise install is configured to use tenant ID '0' by default. The admin and user users carry the admin and user roles, respectively. In practice these roles are identical.

The default passwords are root/root, admin/password, user/password. For each new install, these passwords should be changed.

To change a password, use POST /users/:userid. This can be done through the Swagger UI (described elsewhere in this document) or via other tools such as curl. A sample curl command to change the password:

curl -X POST -d '{ "name": "root", "role": "root", "isactive": true, "password": "<new password>" }' -H "Content-Type: application/json" -H "Authorization: Bearer <token>" http://server/api/users/root

(See also the REST API Usage document for more guidance on using the Analytics REST API.)
TLS

We recommend that you install a server TLS certificate on the reporting server and configure Nginx to use it. An updated nginx.conf file might look like this (choose one of the two ssl_ciphers strings shown):

server {
    listen 80;
    return 301 https://myserver$request_uri;
}
server {
    listen 443;
    server_name myserver;
    ssl on;
    ssl_certificate /etc/nginx/ssl/myserver.crt;
    ssl_certificate_key /etc/nginx/ssl/myserver.key;
    ssl_session_timeout 5m;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers "HIGH:!aNULL:!MD5";    # or "HIGH:!aNULL:!MD5:!3DES"
    ssl_prefer_server_ciphers on;
    location /login {
        proxy_http_version 1.1;
        proxy_pass http://127.0.0.1:8080/login;
    }
}
...
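If you do not yet have a CA-issued certificate, a self-signed one can be generated so the configuration above works end to end for testing. The paths and the myserver name simply match the placeholders in the sample config; a self-signed certificate is not a substitute for a proper certificate in production.

sudo mkdir -p /etc/nginx/ssl
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout /etc/nginx/ssl/myserver.key \
    -out /etc/nginx/ssl/myserver.crt \
    -subj "/CN=myserver"
sudo service nginx restart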
Appendix 1: Configuring interset.conf

The goal of this section is to describe the purpose of each configurable setting in the interset.conf file, discuss some of the situations where these settings should be adjusted, and suggest possible values. The information is separated into 3 sections, corresponding to the three configurable parts of the file: the Dynamic Ingest Configuration, the Static Ingest Configuration, and the Index Generation Configuration. Questions about these settings and their impacts that are not covered in this document can be directed to Support.

Section 1: Dynamic Ingest Configuration

mode
Mode refers to the way that the ingest process is run. If this is set to daemon, the process continues to run and continuously ingests files that are placed into the ingest folder. In runonce mode, the process runs once, ingests the contents of the ingest folder, and then exits. This is set to daemon by default.

scmtype
scmtype refers to the type of information that will be passed to the ingest process. The values for this setting are perforce and repository, and it should be set to the type of information being ingested. For Perforce logs, set it to perforce. If a CSV file is being used, set it to repository.

repoformat
If scmtype is set to repository, then the repoformat line must be uncommented by removing the # from the beginning of the line. The format of the CSV file must then be entered on this line so that the ingest process can interpret the information correctly. For example, if the CSV columns are:

Timestamp, User, Client_IP, Machine_Name, Project, Action, Phone_Number

then the repoformat value should be set as follows:

repoformat = TIMESTAMP,USER,CLIENT_IP,_,PROJECT,ACTION,_

The _ (underscore) causes the ingest to ignore those fields.
ingestfolder
This is the location where files to be ingested are placed. Files are consumed by the process from here.

ingestingfolder
This is the location where files that are being processed are written. This is a transient location for a file while it is being processed.

ingestedfolder
This location is the final destination for files that have been ingested.

ingesterrorfolder
Files that have not been ingested correctly are placed here.

lastmodifiedthreshold
This is the minimum age of a file, in milliseconds, before it is picked up by the ingest for processing.

folderscansinterval
This is the interval, in milliseconds, that the ingest process waits before scanning the ingest folder for new files to process.

p4projectdepth
This setting determines the number of folders that constitute the path for the project name. It should only be used with an scmtype of perforce. Please note, the //depot portion of the path does not contribute to the project depth value. For example, in the Perforce path below, setting this value to 2 results in a project value of folder1/folder2 and a file value of folder3/folder4/folder5/example.txt (see the configuration sketch at the end of this section):

//depot/folder1/folder2/folder3/folder4/folder5/example.txt

tenantid
The tenantid is a value that is assigned to the container created during the software install. This can be left as the default of 0 or can be customized to any three alphanumeric characters. This is the value that
must be referenced when additional interactive users are created that wish to access the web portal to analyze information.

zkphoenix
This value refers to the machine where the analytics tier of the software is installed.

batchsize
When the ingest process is running, this value determines the number of records that are batched for processing. Increasing this value can affect the performance of the ingest process, as the more data that is read, the more processing power is required. The maximum recommended value is 100,000.
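To tie the p4projectdepth example above to an actual configuration, a minimal Dynamic Ingest sketch for ingesting historical Perforce audit logs might look like the following; the folder path and tenant ID are illustrative and should be replaced with your own values.

mode = daemon
scmtype = perforce
# //depot/folder1/folder2/... with p4projectdepth = 2 yields the project folder1/folder2
p4projectdepth = 2
ingestfolder = /tmp/ingest
tenantid = 0
zkphoenix = <ZOOKEEPER>
batchsize = 100000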
Section 2: Static Ingest Configuration

Values in this section should be left intact unless you are advised by the vendor to change them to address a specific issue or situation. The details are included below for informational purposes only. For more information, please contact Support.

maxentriesusercache
To ensure that lookups against the database are as efficient as possible, Users are kept in a local cache. This setting is the maximum number of Users that will be cached.

maxentriesprojectcache
To ensure that lookups against the database are as efficient as possible, Projects are kept in a local cache. This setting is the maximum number of Projects that will be cached.

maxentriesactioncache
To ensure that lookups against the database are as efficient as possible, Actions are kept in a local cache. This setting is the maximum number of Actions that will be cached.

maxentriesclientcache
To ensure that lookups against the database are as efficient as possible, Clients are kept in a local cache. This setting is the maximum number of Clients that will be cached.

maxentriesipcache
To ensure that lookups against the database are as efficient as possible, IPs are kept in a local cache. This setting is the maximum number of IPs that will be cached.

maxentriesfoldercache
To ensure that lookups against the database are as efficient as possible, Folders are kept in a local cache. This setting is the maximum number of Folders that will be cached.

cacheupdateconcurrency
This value sets the number of processes that are used to create the caches. The default is 1, and it should remain 1; setting it to anything other than 1 could result in collisions within the cache tables.
Section 3: Index Generation Configuration

indexmemory
This value is the amount of memory designated for the JVM that is used to create the indexes. Increasing it could result in performance degradation in other areas of the platform.

reportservers
The default value of this setting is localhost. If there are additional reporting servers in the environment, they should be listed here in the format reportserver1,reportserver2,reportserver3.

investigatorpath
This value is the final location that an index is copied into on all reporting servers listed in the reportservers variable. It must match the value contained in the investigator.yml file located on the reporting servers.

templuceneindexpath
The temporary location where index files are stored before they are copied to their final locations on the reporting servers.
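As an illustration only, an Index Generation fragment for a deployment with two reporting servers might look like the sketch below; the hostnames and paths are hypothetical, and investigatorpath must agree with the value configured in investigator.yml on the reporting servers.

# Hypothetical reporting server hostnames and paths
reportservers = reporting01,reporting02
investigatorpath = /opt/interset/reporting/indexes
templuceneindexpath = /tmp/luceneindex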
Appendix 2: Structured Log Configuration

This appendix explains how to configure the system to accept data from a Perforce installation where the server is configured for structured logs.

Configuration

Open the configuration file of the server that is enabled for data ingest, usually located under /opt/interset/analytics:

For version 2.1, the configuration file name is ingest.conf.
For version 2.2 and higher, the configuration file name is interset.conf.

1. Set scmtype to repository.
2. Ensure that repoformat is set as follows:
   _,_,_,TIMESTAMP,_,_,USER,_,_,CLIENT_IP,_,_,_,ACTION,PROJECT[1-5],_
3. Set the ingestfolder to the location where the Perforce logs are located.

The format string values PROJECT1 through PROJECT5 specify that the contents of that column should be turned into project names of the specified depth. For example, if the column contained //depot/empire/3.4/blueprints/deathstar.png then:

PROJECT2 would generate the project name depot/empire
PROJECT4 would generate the project name depot/empire/3.4/blueprints
Example

The following is a sample configuration file:

#---------------------------------------------------------------------------------
# Dynamic Ingest Configuration - Changes to any of the below will be picked
# up on the fly
#---------------------------------------------------------------------------------
#mode = daemon runonce
mode = daemon

#scmtype = perforce repository
scmtype = repository

# repoformat required columns: TIMESTAMP, USER, PROJECT, ACTION
# repoformat optional columns: CLIENT_IP, SIZE
# Ignore fields with '_'
repoformat = _,_,_,TIMESTAMP,_,_,USER,_,_,CLIENT_IP,_,_,_,ACTION,PROJECT3,_

ingestfolder = /tmp/ingest
ingestingfolder = /tmp/ingest/ingesting
ingestedfolder = /tmp/ingest/ingested
ingesterrorfolder = /tmp/ingest/ingesterror

lastmodifiedthreshold = 60000
folderscansinterval = 1200000

#p4projectdepth = 1

tenantid = 1
zkphoenix = localhost
tablename = SE
batchsize = 100000

Note: The p4projectdepth setting is only used in conjunction with an scmtype of perforce, which is used when ingesting the historical Perforce audit log format.