Actian Analytics Platform Express Hadoop SQL Edition 2.0


Tutorial AH-2-TU-05

This Documentation is for the end user's informational purposes only and may be subject to change or withdrawal by Actian Corporation ("Actian") at any time. This Documentation is the proprietary information of Actian and is protected by the copyright laws of the United States and international treaties. It is not distributed under a GPL license. You may make printed or electronic copies of this Documentation provided that such copies are for your own internal use and all Actian copyright notices and legends are affixed to each reproduced copy.

You may publish or distribute this document, in whole or in part, so long as the document remains unchanged and is disseminated with the applicable Actian software. Any such publication or distribution must be in the same manner and medium as that used by Actian, e.g., electronic download via website with the software or on a CD-ROM. Any other use, such as any dissemination of printed copies or use of this documentation, in whole or in part, in another publication, requires the prior written consent from an authorized representative of Actian.

To the extent permitted by applicable law, ACTIAN PROVIDES THIS DOCUMENTATION "AS IS" WITHOUT WARRANTY OF ANY KIND, INCLUDING WITHOUT LIMITATION, ANY IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT. IN NO EVENT WILL ACTIAN BE LIABLE TO THE END USER OR ANY THIRD PARTY FOR ANY LOSS OR DAMAGE, DIRECT OR INDIRECT, FROM THE USE OF THIS DOCUMENTATION, INCLUDING WITHOUT LIMITATION, LOST PROFITS, BUSINESS INTERRUPTION, GOODWILL, OR LOST DATA, EVEN IF ACTIAN IS EXPRESSLY ADVISED OF SUCH LOSS OR DAMAGE.

The manufacturer of this Documentation is Actian Corporation. For government users, the Documentation is delivered with "Restricted Rights" as set forth in 48 C.F.R. Section , 48 C.F.R. Sections (c)(1) and (2) or DFARS Section or applicable successor provisions.

Copyright 2014 Actian Corporation. All Rights Reserved.

Actian, Actian Analytics Platform, Actian DataFlow, Actian Analytics Database Vector Edition, Actian Director, Cloud Action Platform, Cloud Action Server, Action Server, Ingres, Vectorwise, OpenROAD, Enterprise Access, and EDBC are trademarks or registered trademarks of Actian Corporation. All other trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.

Contents

Setup
  Components of Actian Analytics Platform - Express Hadoop SQL Edition
    Actian DataFlow (KNIME)
    Actian Vector - Hadoop Edition
    Actian Director
  What You Will Learn
  Requirements
  Access the Actian Vector Instance
  Sample Database
  Data Files Used in the Tutorial
  How to Run SQL in this Tutorial
  Create Tables in the Sample Database

Lesson 1 Create a Simple ETL Workflow
  Workflows
  The Sales Data Input File
  sales_fact Table
  Lesson 1 Step 1: Create a New Workflow Project
  Lesson 1 Step 2: Add a Text Reader Node
    Add a Text Reader Node
    Test the Node
    Reset Node Status
  Lesson 1 Step 3: Add a Load Actian Vector On Hadoop Node
    Add a Load Actian Vector On Hadoop Node
    Configure the Load Actian Vector On Hadoop Node
  Lesson 1 Step 4: Run the Workflow
    Execute the Workflow
    Save the Workflow
    Optimize the Database
  Lesson 1 Step 5: Query the Data
    Verify Data Loaded into sales_fact Table
    Determine the Profit at Each Store for Each Product

Lesson 2 Load Multiple Data Files
  Lesson 2 Step 1: Use Wildcards in the Text Reader Node
    Open the ETL Workflow Project
    Configure Text Reader to Load Multiple Files
    Run the Amended Workflow
  Lesson 2 Step 2: Query the Data
    Verify Data Loaded into sales_fact Table
    Determine the Profit at Each Store for Each Product

Lesson 3 Derive a Field and Add Data Lookup
  sales_city_fact Table
  Lesson 3 Step 1: Add a Derive Fields Node
    Remove Connection Between the Text Reader and Load Actian Vector On Hadoop Nodes
    Add a Derive Fields Node
  Lesson 3 Step 2: Add Text Reader Node
    Add a Text Reader Node
  Lesson 3 Step 3: Add a Join Node
    Add a Join Node
    Configure the Join Node
  Lesson 3 Step 4: Reconfigure the Load Actian Vector On Hadoop Node
    Reconfigure the Load Actian Vector On Hadoop Node
  Lesson 3 Step 5: Run the Workflow
    Run the Workflow
    Save the Workflow
  Lesson 3 Step 6: Query the Data
    Verify Data Loaded into sales_city_fact Table
    Answer the BI Questions

Lesson 4 Workflow Deployment
  DataFlow Executor
  Execute in Batch Mode
    Understanding Batch Mode Command Parameters
    Running the SIMPLE ETL Workflow in Batch Mode
  KNIME Server
  Scheduling

Lesson 5 Connect to Actian Vector through ODBC
  Install Actian ODBC Driver
  Create an Actian Vector DSN
  Use Excel to Query Data

Lesson 6 Connect to Actian Vector through JDBC
  JDBC Connection URL
  Run the Example Java Program
  View JDBC Connection URL in Actian Director

FAQ

Appendix
  Installing an X Server
  Connect to the Host as Actian User
  Installing Actian Director on Windows


Setup

This chapter describes the requirements for working through the tutorial, including:

- Required software
- How to run SQL
- How to create the required tables in the database

Components of Actian Analytics Platform - Express Hadoop SQL Edition

Actian Analytics Platform - Express Hadoop SQL Edition consists of the following components:

- Actian DataFlow (KNIME)
- Actian Vector - Hadoop Edition
- Actian Director

Actian DataFlow (KNIME)

Actian DataFlow (KNIME) is a user-friendly graphical workbench that can be used for creating ETL (Extraction, Transformation, and Loading) workflows. Express Hadoop SQL Edition contains the full set of Actian DataFlow operators for use within KNIME. These operators let you create nodes (modules) for:

- Extracting data from files and databases
- Executing SQL statements
- Controlling flow
- Deriving values
- Performing predictive analysis
- Visualizing
- Reporting

The Actian DataFlow engine provides scalability by using all the server cores and cluster nodes that are available at runtime. With Actian DataFlow, a workflow created on a development server can be deployed for execution on a production server without any change. The Actian DataFlow operators automatically use the extra resources (such as cores and memory) available on the production server.

Actian Vector - Hadoop Edition

Actian Vector - Hadoop Edition scales out the raw single-machine performance of Actian Analytics Database - Vector Edition by leveraging the Hadoop Distributed File System (HDFS) for storage. Actian Vector is a database management system for analytical database applications such as data warehousing, data mining, and reporting. Vector is optimized to work with both memory- and disk-resident datasets, allowing it to efficiently process large amounts of data. Its innovative technology allows analytical queries to run fast. The Hadoop Edition provides Vector performance on every node, SQL access with regular tools, and full analytical capabilities.

Actian Director

Actian Director is an easy-to-use graphical interface that lets you interact with Actian Vector installations. Using Director, you can:

- Manage databases, tables, servers, and their components
- Administer security (users, groups, roles, and profiles)
- Create, store, and execute queries

What You Will Learn

In this tutorial, you will learn how to use Actian DataFlow operators to create a simple ETL workflow that loads sales details into an Actian Vector database. After the simple ETL workflow has been executed, you will be able to answer the following BI questions:

- How much profit is generated by each product at each store?
- What is the total profit generated by all the stores in a city?
- What is the total profit generated per country?

Specifically, you will learn how to use Actian DataFlow operators to:

- Extract the sales information from multiple CSV files.
- Load the sales details into an Actian Vector database.
- As part of the workflow, derive a value for the profit for each product, because this value is not included in the CSV file.
- As part of the workflow, add the city and country where each store is located.

The simple ETL workflow will be created using the following Actian DataFlow operators:

- Delimited Text Reader
- Load Actian Vector On Hadoop
- Derive Fields
- Join

Requirements

This tutorial is intended for users familiar with database fundamentals but new to Actian DataFlow and Actian Vector. The following must be available to run the tutorial:

- Actian Analytics Platform - Express Hadoop SQL Edition 2.0 (for installation instructions, see the readme)
- X11 forwarding over SSH enabled

If accessing the remote Linux node from your desktop, you must have an X11 server and an SSH client to work with Actian DataFlow (KNIME) and Actian Director from the remote instance. Instructions for installing an X server and SSH client on a Windows PC can be found at Installing an X Server (see page 65).

Actian also provides a native Windows client package for Actian Director that can be installed and used for this tutorial. If you want to install Actian Director on your Windows desktop and use it to connect to your remote Linux node, follow the installation instructions for Actian Director on Windows (see page 66). You still need an X server running on your desktop to run Actian DataFlow (KNIME) if you are connecting remotely to a Linux node using SSH.

Access the Actian Vector Instance

During installation of Express Hadoop SQL Edition, an environment file named .ingahsh or .ingahcsh is written to the home directory ($HOME) of the "actian" user, which was created during installation. To access your instance, you must source this environment file as user actian.

To source the environment file

Issue the following command:

source ~actian/.ingahsh
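If the actian account uses a C-shell variant, source the csh environment file instead. A minimal sketch, assuming .ingahcsh is the csh-format file the installer writes (per the note above):

source ~actian/.ingahcsh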

Sample Database

The simple ETL workflow will load data into a table in a database. The database for this tutorial was created as part of the Express Hadoop SQL Edition install process. The database is named sample.

Data Files Used in the Tutorial

The three data files used in the tutorial are included in the distribution:

- sample_sales_data_1.csv
- sample_sales_data_2.csv
- store_city.csv

After installing Express Hadoop SQL Edition, you can find these files in this location:

/opt/actian/analyticsplatformah/ingres/tutorial

How to Run SQL in this Tutorial

You will need to run SQL several times during the tutorial. Actian Director is an easy-to-use graphical interface that lets you interact with Actian Vector installations and execute SQL statements. There are two ways to work with Actian Director: you can either start Director on the Linux node where you installed Express (and have the Director user interface display on your desktop) or connect to the remote node with a Director installed natively on your Windows desktop.

To start Actian Director and connect to the instance

If you want to start Director from the Linux node:

1. Log in (or switch user) as the actian user. For specifics on connecting based on your scenario, see Connect to the Host as Actian User (see page 65).
2. Start Actian Director by entering the following command:

   director

   Director is started. The Actian Vector Hadoop Edition AH instance is displayed in the Instance Explorer.

3. Right-click the Actian Vector Hadoop Edition AH instance in the Instance Explorer, and then select Connect.

   The Connect to Instance dialog opens.

4. Enter the following credentials, and then click Connect:

   Authentication: Authenticated User
   Login: demo
   Password: hsedemo

   You are connected to the instance.

If you want to start Director on your Windows PC and use it to connect to the Linux node:

1. If you have not yet installed Actian Director on your Windows desktop, follow the installation instructions (see page 66).
2. Start Actian Director from the Start Menu.
3. Click "Connect to an instance" in the Start Page, or click Connection, Connect on the toolbar.

   The Connect to instance dialog is displayed.

4. Enter the following credentials, and then click Connect:

   Instance: Name of the remote Linux host where you have installed Express.
   Authentication: Authenticated User
   Login: demo
   Password: hsedemo

   You are connected to the instance.

To run SQL from Director

1. Expand the Actian Vector Hadoop Edition AH instance in the Instance Explorer. Select Databases, sample. On the menu ribbon, select Query, New.

   A Query tab is opened in the right pane.

2. Type in your SQL statements, and then press Execute on the query toolbar.

   The SQL statements are executed. (Alternatively, you can select Query, Open to select a file containing SQL.)

Create Tables in the Sample Database

Two tables must be created in the sample database:

- sales_fact (used in Lesson 1)
- sales_city_fact (used in Lesson 3)

SQL scripts are provided that create the tables for you.

To create the sales_fact and the sales_city_fact tables

1. Connect to the sample database using Actian Director. For details, see How to Run SQL in this Tutorial (see page 10).
2. Select Query, Open on the menu ribbon. In the Browse dialog, navigate to /opt/actian/analyticsplatformah/ingres/tutorial, and then select the following file:

   create_sales_fact_table.txt

   The SQL to create the table is displayed in a new tab in the right pane.

3. Select the sample database from the drop-down list on the query toolbar.
4. Click Execute on the toolbar.

   The table is created.

5. Repeat Steps 1 through 4, except this time for Step 2 select the following file:

   create_sales_city_fact_table.txt

6. (Optional) Select Home, Refresh on the menu ribbon.

   The newly created tables are listed in the Instance Explorer under: Actian Vector Hadoop Edition AH, Databases, sample, Tables.


Lesson 1 Create a Simple ETL Workflow

In this lesson you will create a simple ETL workflow that extracts data from a flat file and loads the contents of the file into the sales_fact table in Actian Vector. After the simple ETL workflow has been executed, you will be able to answer the following BI question: How much profit is generated by each product at each store?

Workflows

A workflow consists of a set of nodes (also known as operators) that operate on a data set. The input data set for the workflow is created either by loading a flat file (using nodes such as Text Reader or Log Reader) or by reading data from a table (using nodes such as Database Reader or HBase Reader). Nodes are connected together so that the output from a node is the input of subsequent nodes. By connecting nodes together, a workflow is created that transforms the input data. Once the transformation has been completed, the data set can be saved as a flat file (Text Writer node) or loaded into a database table (Load Actian Vector On Hadoop node).

Note: KNIME refers to operators as nodes.

Data sets can be joined together to create a larger data set. Joining data sets is covered in Lesson 3 (see page 33). Also, the output from one node can be used as the input to multiple nodes.

The Sales Data Input File

For this tutorial, the source input data file is sample_sales_data_1.csv, found in the subdirectory named tutorial. This is a comma-separated file containing sales information with the following fields:

- product_id
- time_id
- customer_id
- promotion_id
- store_id
- store_sales
- store_cost
- unit_sales

These are the first four rows in the file:

product_id,time_id,customer_id,promotion_id,store_id,store_sales,store_cost,unit_sales
337,371,6280,0,2,1.5,0.51,2
1512,371,6280,0,2,1.62,0.632,3
963,371,4018,0,2,2.4,0.72,1

Notes:

- The source file has a header row. The Text Reader node analyzes the contents of the input source file to determine the data types for each field. Additionally, if the file has a header row, these field names will be used as the column names in the data set created by the Text Reader node. The data set column names are then available for use by the Load Actian Vector On Hadoop node when mapping the data set to the columns in the destination table.
- The input file has no value for profit. You will initially use SQL to derive a value for profit, but in Lesson 3 (see page 33) you will amend the workflow to include a node to derive the profit for each product.
- If the source file does not have a header row, the Text Reader node will create generic column names. You can either inject column names after the file has been read using the Insert Column Header node, or map the generic column names when configuring the Load Actian Vector On Hadoop node.

sales_fact Table

The sales data from the input source files will be loaded into the sales_fact table in the sample database. If you did not create the sales_fact table earlier (as described in Create Tables in the Sample Database (see page 13)), then use the following SQL to create it:

CREATE TABLE sales_fact(
    product_id INTEGER NOT NULL,
    time_id INTEGER NOT NULL,
    customer_id INTEGER NOT NULL,
    promotion_id INTEGER NOT NULL,
    store_id INTEGER NOT NULL,
    store_sales DECIMAL(10,4) NOT NULL,
    store_cost DECIMAL(10,4) NOT NULL,
    unit_sales INTEGER NOT NULL
) WITH PARTITION = (HASH ON product_id <numparts> PARTITIONS);
COMMIT;

Note: In the WITH PARTITION clause, numparts is the number of partitions to be created for the data. This number should be a multiple of the total number of physical cores across all nodes where Vector Hadoop Edition is installed.

Note: If you want other users to access the tables in this tutorial, you may have to grant privileges to those users. You can do so by using Actian Director, as sketched below.
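For example, a minimal SQL sketch granting read access on the tutorial table, which you could run from Director's query tool (the user name bob is a hypothetical placeholder; substitute a real login):

GRANT SELECT ON sales_fact TO bob;
COMMIT;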

Lesson 1 Step 1: Create a New Workflow Project

The first step in creating a new workflow is to create a new workflow project. The project includes the Workflow Credentials, Workflow Variables, Work Preferences, and the workflow canvas onto which the workflow nodes that will transform the data will be placed.

To create a new workflow project

1. Connect as user actian to the Linux master node on which you installed Express. For specifics on connecting based on your scenario, see Connect to the Host as Actian User (see page 65).
2. Start Actian DataFlow (KNIME) by entering knime.
3. On the File menu, select New.

   The New dialog is displayed, which lets you select a wizard.

4. Select Actian Dataflow Workflow, and then click Next.

   The Create a New Actian Dataflow Workflow dialog is displayed.

5. Change the default workflow name to SimpleETL, leave the Workflow Group set to the default value, leave the profile set as Development, and then click Finish.

   A new tab titled SimpleETL is created.

Note: Several different templates can be used when creating a new workflow project. The Actian DataFlow Workflow template must be used if the workflow will include any Actian DataFlow nodes. Using the Actian DataFlow Workflow template allows for streaming execution of the nodes, which allows for much faster execution of workflows. The data is streamed from node to node without having to be staged to disk. Also, the workflow nodes can be run in parallel on a single machine or even on a Hadoop cluster. For more details, see the Actian DataFlow Online Help pages.

Lesson 1 Step 2: Add a Text Reader Node

In this step, you will add and configure the Actian DataFlow Text Reader node to the workflow. The Text Reader node extracts data from a file and creates a data set ready to be used by subsequent nodes in the workflow.

Add a Text Reader Node

To configure a Text Reader, you specify the location of the file, the field separator, and whether the file has a header row.

To add a new Text Reader node to the workflow

1. In the Node Repository, expand the following nodes by clicking their expand arrows: Actian Dataflow, I/O, Read.
2. Select the Delimited Text Reader node and drag it onto the workflow canvas.
3. Right-click the Delimited Text Reader node and select Configure from the context menu (or simply double-click the node).

   The Reader Properties tab for the Delimited Text Reader is displayed.

4. Click Browse.

   The Open dialog box is displayed.

5. Locate the file sample_sales_data_1.csv in the following directory, and then click Open:

   /opt/actian/analyticsplatformah/ingres/tutorial

   The file name is then displayed in the Location field on the Reader Properties tab.

6. Leave the Field separator set to the default value of comma.
7. Select the "Has header row" check box, and then click OK.

The Actian DataFlow Text Reader node analyzes the text file to determine the data type for each field. Because the input text file has a header row, the field names from this row will be used as the column names in the data set created by the Text Reader.

Node Status Indicator

While the Text Reader node has several advanced features (including Operator Settings, Flow Variables, Job Manager Selection, Memory Policy, and reading a file from a cluster), only the Job Manager Selection is covered in this tutorial.

Under the Text Reader node there is a traffic light status indicator.

This status indicator is present on all nodes and signifies the following:

- (Red) The node has not been configured, or execution of the node failed.
- (Amber) The node has been configured, but not run. This status is also set if this or a preceding node in the workflow has been reset and this node needs to be rerun.
- (Green) The node has been configured and execution completed successfully.

Within a workflow, a node with a status of green will not be rerun because nothing has changed. To force the execution of a node with a status of green, the node must be reset.

Test the Node

To test that the Text Reader node has been configured successfully

Click Execute All on the toolbar (or select Node, Execute All).

Execution of the workflow will start. If the Text Reader node has been successfully configured, the status indicator will show green.

Reset Node Status

After a node has been run successfully, the node will not be rerun unless the node is reconfigured or a preceding node in the workflow is changed. To force the execution of a node that has been run successfully, the node status must be reset.

To reset the node status

Right-click the Delimited Text Reader node and select Reset from the context menu.

The node is reset.

Lesson 1 Step 3: Add a Load Actian Vector On Hadoop Node

In this step, you will add and configure the Actian DataFlow Load Actian Vector On Hadoop node to the workflow. The Load Actian Vector On Hadoop node loads data into a database table using the data set created by a preceding node in the workflow.

The process for setting up the Load Actian Vector On Hadoop node is summarized as follows:

1. Create a database connection to Actian Vector.

   Note: A connection named Actian_Vector_AH was created during installation.

2. Connect the input of the Load Actian Vector On Hadoop node to the output from a preceding node. This must be done before you can configure the Load Actian Vector On Hadoop node.

3. Configure the Load Actian Vector On Hadoop node with details of the database connection to use and the destination table name, and then map the input data set columns to the destination table columns.

4. Provide the following SQL:

   Initialization SQL: This will be executed before the data is loaded into the table. For example, delete existing rows in the table before loading in new data.

   Finalization SQL: This will be executed after the data has been loaded into the table.

There are two nodes that can be used to load data into a Vector Hadoop Edition database:

- Load Actian Vector On Hadoop (used in this tutorial): This node formats the input data for loading using the cluster-enabled vwload utility. If executing the node from a remote client, the client must have access to the HDFS instance on which Vector Hadoop Edition is running.
- Database Writer: This node uses JDBC to load the data into the database table. This option is slower than using the Load Actian Vector On Hadoop node.

Add a Load Actian Vector On Hadoop Node

A Load Actian Vector On Hadoop node will load a data set into a database table.

To add the Load Actian Vector On Hadoop node to the workflow

1. In the Node Repository, expand the following tree nodes by clicking their expand arrows: Actian Dataflow, I/O, Write.
2. Select the Load Actian Vector On Hadoop node and drag it onto the workflow canvas.
3. Connect the output from the Text Reader node to the input of the Load Actian Vector On Hadoop node:
   a. Click and hold the output arrow on the Text Reader node.
   b. Drag the mouse to the input arrow on Load Actian Vector On Hadoop.

Configure the Load Actian Vector On Hadoop Node

The final step in adding the Load Actian Vector On Hadoop node is to configure it.

To configure the Load Actian Vector On Hadoop node

1. Right-click the Load Actian Vector On Hadoop node and select Configure.
2. On the Load Actian Vector On Hadoop dialog:
   a. For Connection, select the Actian_Vector_AH database connection.
   b. For Table Name, select sales_fact.
   c. For Temporary Directory, select hdfs://localhost:8020/actian/tmp.

      Note: You must provide a temporary directory in the HDFS instance on which Vector Hadoop Edition is installed. The actual HDFS URL may be different depending on your Hadoop distribution and configuration. To determine the correct Temporary Directory value, run the following command to determine the default URL for your Hadoop instance, and then append '/actian/tmp' to the result:

      hdfs getconf -confKey fs.defaultFS

   d. Click Map Fields.

      The Source to Target Mapper dialog is displayed.

   e. Click Map by name, and then click OK.

3. Click the Initialization SQL tab and enter the following SQL (the tables were created for the "demo" user):

   MODIFY demo.sales_fact TO TRUNCATED

   Any existing data will be deleted from the sales_fact table. This is required because you will execute the Load Actian Vector On Hadoop node many times as part of this tutorial.

4. Click OK.

   The Load Actian Vector On Hadoop dialog is closed.

Lesson 1 Step 4: Run the Workflow

You have created a simple ETL workflow that extracts data from a flat file and loads the data into an Actian Vector table.

Execute the Workflow

To verify that the nodes have been correctly configured, execute the workflow.

To execute the workflow

Click Execute All on the toolbar (or select Node, Execute All).

Execution of the workflow starts. If the nodes have been successfully configured, the status indicator for both nodes will show green.

Save the Workflow

In a later lesson in this tutorial you will amend this ETL workflow. Therefore, you may wish to save the workflow before moving on to the next lesson.

To save the simple ETL workflow

On the File menu, select Save.

The workflow is saved.

Optimize the Database

After loading data, you should optimize the database. Optimizing the database generates statistics that tell the query optimizer what the data looks like. The query optimizer uses the statistics to generate a query execution plan (QEP) that shows how your query is executed. The QEP can be reused to execute the same query.

To optimize the tables in the sample database

1. Start Actian Director and connect to the Actian Vector Hadoop Edition AH instance. (For instructions, see How to Run SQL in this Tutorial (see page 10).)
2. Select the sample database from the Instance Explorer.
3. Select Database, Generate Statistics on the menu ribbon.

   The Generate Statistics dialog is displayed.

4. Accept the default settings, and then click OK.

   The optimizedb utility generates the statistics and issues the message: "Execution of the utility has completed."

5. Click Close.

Note: While this step is not repeated in the tutorial, it is a good practice to optimize the database after each data load.
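As an alternative to the Generate Statistics dialog, the same statistics can be generated from the command line with the optimizedb utility named in Step 4 above. A minimal sketch, assuming you run it as a user with access to the sample database after sourcing the environment file (with no flags, statistics are generated for the database's tables):

optimizedb sample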

Lesson 1 Step 5: Query the Data

You have created and executed a simple ETL workflow that extracts data from a flat file and loads the data into an Actian Vector database table. In this step, you will verify that the data has been loaded and run SQL to identify the profitability of each product at all stores.

Verify Data Loaded into sales_fact Table

Running a query will verify that all the data from the flat file has been successfully loaded into the sales_fact table.

To verify that the data was successfully loaded

1. Connect to the Actian Vector Hadoop Edition AH instance using Actian Director.
2. Select the sample database from the Instance Explorer.
3. Select Query, New on the menu ribbon.

   A query tab is opened.

4. Type the following SQL into the query document:

   SELECT COUNT(*) FROM sales_fact;
   COMMIT;

5. Click Execute on the query toolbar.

   The query should return a count of 55,063.

Determine the Profit at Each Store for Each Product

To determine the profitability of each product at each store, we must calculate the profit of each product because this value is not included in the input data file.

Note: Instead of calculating the profit for each product using SQL, this value could have been derived as part of the ETL workflow. How to add a derived field into a workflow is covered in Lesson 3 Derive a Field and Add Data Lookup (see page 33).

To determine the profitability of each product

Run the following SQL:

SELECT store_id, product_id,
       SUM(store_sales - (store_cost * unit_sales)) AS total_profit,
       SUM(unit_sales) AS total_units_sold
FROM sales_fact
GROUP BY store_id, product_id
ORDER BY store_id, total_profit DESC;
COMMIT;

Note: For instructions, see How to Run SQL in this Tutorial (see page 10).


Lesson 2 Load Multiple Data Files

In Lesson 1 (see page 15) you created a simple ETL workflow that loaded the contents of the sample_sales_data_1.csv file into a data table. But what if you need to load multiple files, all with the same file structure? What changes need to be made to the workflow?

In this lesson you will change the Delimited Text Reader node to process multiple files (with the same structure) by using wildcards. The files are named as follows:

- sample_sales_data_1.csv: This file contains 55,064 rows, one of which is a header row.
- sample_sales_data_2.csv: This file contains 31,775 rows, one of which is a header row.

In total you will load 86,837 rows into the sales_fact table.

Note: If each file had a different structure, you would have to:

- Use a Text Reader node for each file.
- Use a Join to merge the data sets together.

In Lesson 3 (see page 33) you will learn how to merge data from two files using a Join node.

Lesson 2 Step 1: Use Wildcards in the Text Reader Node

By using a wildcard in the file name, the Text Reader node will extract data from all the matching file names.

Open the ETL Workflow Project

If you saved the workflow created in Lesson 1 (see page 15), you need to open the simple ETL workflow.

To open the simple ETL workflow that you created in Lesson 1

1. On the KNIME Explorer panel, expand the LOCAL (Local Workspace) tree node.
2. Double-click the simple ETL workflow named SimpleETL.

   The workflow is opened.

Configure Text Reader to Load Multiple Files

The Text Reader allows wildcards to be used in the file name. If a wildcard is used, the Text Reader node will extract data from all the matching file names before passing the data set to subsequent nodes.

To use wildcards with the Text Reader node

1. Right-click the Delimited Text Reader node and select Configure.
2. Click the Location entry field and navigate to the end of the file name.
3. Add a wildcard to the file name by changing sample_sales_data_1.csv to sample_sales_data_*.csv.
4. Tab out of the Location field.

   The contents of the files are analyzed.

5. Click OK.

   The Confirm reset message is displayed.

6. Click OK.

   The status indicator on the Text Reader node is reset to amber to signify that there has been a change and execution of the node is required. The status of the Load Actian Vector On Hadoop node is also reset to amber because this node needs to be rerun now that its preceding node has been reset.

Run the Amended Workflow

To test that the Text Reader node will load multiple files, run the amended workflow.

To run the amended workflow

Click Execute All on the toolbar (or select Node, Execute All).

By adding a wildcard to the file name, you have loaded the content of multiple files. A total of 86,837 rows are loaded into the sales_fact table.

Lesson 2 Step 2: Query the Data

You have changed the simple ETL workflow to extract data from multiple flat files and load the data into an Actian Vector table. In this step, you will verify that the data has been loaded successfully.

Verify Data Loaded into sales_fact Table

Running a query will verify that all the data from the flat files has been successfully loaded into the sales_fact table.

To verify that the data was successfully loaded

Run the following SQL:

SELECT COUNT(*) FROM sales_fact;
COMMIT;

The query should return a count of 86,837.

Note: For instructions, see How to Run SQL in this Tutorial (see page 10).

Determine the Profit at Each Store for Each Product

To identify the profitability of each product at each store, we must derive a value for the profit of each product because this value is not included in the data file.

Note: Instead of deriving the profit for each product as part of the SQL, this value can be derived as part of the ETL workflow by including a node that calculates the profit value and injects it into the data set before it is loaded into the database table. How to add a derived field into a workflow is covered in Lesson 3 Derive a Field and Add Data Lookup (see page 33).

To determine the profit for each product

Run the following SQL (the same SQL as in Lesson 1):

SELECT store_id, product_id,
       SUM(store_sales - (store_cost * unit_sales)) AS total_profit,
       SUM(unit_sales) AS total_units_sold
FROM sales_fact
GROUP BY store_id, product_id
ORDER BY store_id, total_profit DESC;
COMMIT;

Lesson 3 Derive a Field and Add Data Lookup

In Lesson 2 (see page 29) you created a simple ETL workflow that loaded the contents of multiple files into a data table. But we had to use SQL to calculate the profit for each product because the value for profit is not included in the input data files. As part of an ETL workflow, a transformation of the data may be required before the data is loaded into a database.

In this lesson, you add nodes to:

- Calculate a derived field (profit for a product at a store)
- Perform a lookup transformation by adding the city and country where the store is located

After these nodes have been added and the workflow successfully run, you will be able to answer the following BI questions:

- What is the total profit generated by all the stores in a city?
- What is the total profit generated per country?

sales_city_fact Table

Although the input files for this lesson remain the same, adding a derived field and a lookup transformation means that we must use the sales_city_fact table, rather than the sales_fact table, for this lesson. If you did not create the sales_city_fact table earlier (as described in Create Tables in the Sample Database (see page 13)), then use the following SQL to create it:

CREATE TABLE sales_city_fact(
    product_id INTEGER NOT NULL,
    time_id INTEGER NOT NULL,
    customer_id INTEGER NOT NULL,
    promotion_id INTEGER NOT NULL,
    store_id INTEGER NOT NULL,
    store_sales DECIMAL(10,4) NOT NULL,
    store_cost DECIMAL(10,4) NOT NULL,
    unit_sales INTEGER NOT NULL,
    profit DECIMAL(10,4) NOT NULL,
    city VARCHAR(25) NOT NULL,
    country VARCHAR(25) NOT NULL
) WITH PARTITION = (HASH ON product_id <numparts> PARTITIONS);
COMMIT;

Note: In the WITH PARTITION clause, numparts is the number of partitions to be created for the data. This number should be a multiple of the total number of physical cores across all nodes where Vector Hadoop Edition is installed.

The sales_city_fact table has these additional columns:

- profit, which will hold the derived profit value
- city, which will hold the city where the store is located
- country, which will hold the country where the store is located

Lesson 3 Step 1: Add a Derive Fields Node

In this step you will amend the simple ETL workflow that you created as part of Lesson 2 to include a node to derive a field.

Remove Connection Between the Text Reader and Load Actian Vector On Hadoop Nodes

Because new nodes will be added into the workflow after the sales data files have been loaded, the connection between the output of the Text Reader node and the input of the Load Actian Vector On Hadoop node needs to be removed.

To remove the connection

1. Right-click the link (black line) between the two nodes and select Delete.

   The Confirm dialog is displayed.

2. Click OK.

   The connection is deleted and the status of the Load Actian Vector On Hadoop node changes to red.

Add a Derive Fields Node

To add a new Derive Fields node to the workflow

1. In the Node Repository, expand the following tree nodes by clicking their expand arrows: Actian Dataflow, Transformations, Manipulation.
2. Select the Derive Fields node and drag it onto the workflow canvas.
3. Connect the output from the Text Reader node to the input of the Derive Fields node:
   a. Click and hold the output arrow on the Text Reader node.
   b. Drag the mouse to the input arrow on the Derive Fields node.

   Note: The Text Reader and Derive Fields nodes must be connected together before the Derive Fields node can be configured, because the column names used by the Derive Fields node are taken from the input data set.

4. Right-click the Derive Fields node and select Configure (or simply double-click the node).
5. Click Add.
6. Double-click the Output Field column and replace the default value of field0 with profit.
7. Double-click the Expression column and replace the default value of 0 with store_sales - (store_cost * unit_sales).
8. Click OK.

   The formula is saved.

If the expression entered has an error, a red warning triangle is shown next to the expression. You can amend an existing derivation by clicking Edit on the Derived Outputs panel. This also allows you to check the syntax of the expression.

Lesson 3 Step 2: Add Text Reader Node

In this step you will amend the ETL workflow to include a node that adds details of the city and country for each store. The city and country for each store are in the store_city.csv file. This is a comma-separated file with columns store_id, city, and country.

These are the first five rows in this file:

store_id, city, country
2,Leeds, United Kingdom
3,Slough, United Kingdom
6,Suresnes,France
7,Islandia, USA

As in the sample_sales_data_1.csv file used in Lesson 1 (see page 15), the input source file has a header row.

Add a Text Reader Node

In Lesson 1 Step 2 you added a Text Reader node to extract data from a file and import it into a data set for processing by subsequent nodes.

To add a Text Reader node onto the workflow to read the store_city.csv file

Repeat the steps from Lesson 1 Step 2, but this time specify the store_city.csv file as the Source.

In your workflow, you have now loaded the contents of three flat files (remember that you have loaded two sales data files and one data lookup file). The files have been loaded into separate data sets, which need to be joined together. In the next step in this lesson you will add a Join node to perform an INNER join between the two data sets to add the city and country for each store.

Lesson 3 Step 3: Add a Join Node

In this step you will join the data sets from the two Text Reader nodes together using a Join node.

Add a Join Node

A Join node has two inputs (left and right) that can be joined with an INNER, LEFT OUTER, RIGHT OUTER, or FULL OUTER join. For the simple ETL workflow you need an INNER join.

To add a new Join node to the workflow

1. In the Node Repository, expand the following tree nodes by clicking their expand arrows: Actian Dataflow, Transformations, Aggregate.
2. Select the Join node and drag it onto the workflow canvas.

3. Connect the output from the Derive Fields node to the top input on the Join node.
4. Connect the output from the Text Reader node (used to read the store_city.csv file) to the bottom input on the Join node.

Configure the Join Node

You need to configure the Join node to specify the type of join and the fields to be used for the join.

To configure the Join node

1. Right-click the Join node and select Configure (or simply double-click the node).
2. For Join Type, select INNER.
3. Select Merge Key Fields.

   Selecting Merge Key Fields ensures that the output data set from the Join will not contain two store_id columns.

4. Click Add to create a new join row in the Join Source table grid.
5. Select store_id as the Left Key.
6. Select store_id as the Right Key.
7. Click OK.

You can specify additional predicates to be used when performing the join, but these are not needed for the simple ETL workflow.
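In relational terms, the Join node configured above behaves like the following SQL inner join. This is an illustrative sketch only: the inputs are in-memory DataFlow data sets rather than database tables, and the names sales and store_city are hypothetical stand-ins for the two node outputs:

SELECT s.*, c.city, c.country
FROM sales s
INNER JOIN store_city c ON s.store_id = c.store_id;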

Lesson 3 Step 4: Reconfigure the Load Actian Vector On Hadoop Node

In this step you will reconfigure the Load Actian Vector On Hadoop node to load data into the sales_city_fact table.

Reconfigure the Load Actian Vector On Hadoop Node

To configure the Load Actian Vector On Hadoop node to load into the sales_city_fact table

1. Connect the output from the Join node to the input on the Load Actian Vector On Hadoop node.
2. Right-click the Load Actian Vector On Hadoop node and select Configure (or simply double-click the node).

   The Load Actian Vector On Hadoop dialog is displayed.

   a. For Connection, select the Actian_Vector_AH database connection.

   b. For Table Name, select sales_city_fact.
   c. For Temporary Directory, select hdfs://localhost:8020/actian/tmp.

      Note: The actual HDFS URL may be different depending on your Hadoop distribution and configuration. To determine the correct Temporary Directory value, run the following command to determine the default URL for your Hadoop instance, and then append '/actian/tmp' to the result:

      hdfs getconf -confKey fs.defaultFS

3. Click Map Fields.

   The Source to Target Mapper dialog is displayed.

4. Click Map by name, and then click OK.
5. Click the Initialization SQL tab and enter the following SQL:

   MODIFY demo.sales_city_fact TO TRUNCATED

   Data will be deleted from the sales_city_fact table.

   Note: When deleting all data in a large table, MODIFY...TO TRUNCATED is preferred to DELETE FROM because MODIFY creates only one entry in the log file, regardless of the number of rows, while DELETE FROM creates one entry per row. A DELETE FROM operation will take much longer to complete due to logging. (See the comparison after this procedure.)

6. Click OK.

   The Load Actian Vector On Hadoop dialog is closed.
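To make the note in Step 5 concrete, both of the statements below empty the table, but the first writes a single log entry regardless of row count while the second writes one log entry per deleted row. This is an illustrative comparison only; the tutorial itself uses the MODIFY form:

MODIFY demo.sales_city_fact TO TRUNCATED;
DELETE FROM demo.sales_city_fact;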

Lesson 3 Step 5: Run the Workflow

You have modified the simple ETL workflow to include transformations to derive a profit field and perform a data lookup to add the city and country for each store.

Run the Workflow

To test that the nodes in the workflow have been configured successfully, we execute the workflow.

To run the workflow

Click Execute All on the toolbar (or select Node, Execute All).

Execution of the workflow starts. The status indicator for all nodes will show green if the Text Reader and Load Actian Vector On Hadoop nodes have been successfully configured.

If the execution of the workflow fails, look at the error messages in the Console. If the error mentions the store_id_2 column, then you did not select the Merge Key Fields check box on the Join node.

Save the Workflow

To save the simple ETL workflow

On the File menu, select Save.

The workflow is saved.

Lesson 3 Step 6: Query the Data

You have modified the simple ETL workflow to include transformations that derive a profit field and perform a data lookup to add the city and country for each store. In this step, you will verify that the data has been loaded and answer the following BI questions:

- What is the total profit generated by all the stores in a city?
- What is the total profit generated per country?

Verify Data Loaded into sales_city_fact Table

Running a query will verify that all the data from the flat files has been successfully loaded into the sales_city_fact table.

To verify that the data was successfully loaded

Run the following SQL:

SELECT COUNT(*) FROM sales_city_fact;
COMMIT;

The query should return a count of 86,837.

Note: For instructions, see How to Run SQL in this Tutorial (see page 10).

Answer the BI Questions

To answer the two BI questions

Run the following SQL:

SELECT city,
       SUM(profit) AS total_profit,
       SUM(unit_sales) AS total_units_sold
FROM sales_city_fact
GROUP BY city
ORDER BY total_profit DESC;
COMMIT;

SELECT country,
       SUM(profit) AS total_profit,
       SUM(unit_sales) AS total_units_sold
FROM sales_city_fact
GROUP BY country
ORDER BY total_profit DESC;
COMMIT;

The results of these two queries are displayed in the Results 1 and Results 2 tabs in Actian Director.


Lesson 4 Workflow Deployment

The simple ETL workflow you created in Lesson 3 (see page 33) to load store sales data into the Actian Vector data table was executed interactively inside the KNIME workbench. In this lesson you will learn how to execute a workflow outside the KNIME workbench.

The options available to run a workflow non-interactively outside the KNIME workbench are:

- Execute the workflow with DataFlow
- Execute in headless batch mode
- Execute using the KNIME Server

DataFlow Executor

The Actian DataFlow nodes are designed for scalability and performance. Scalability can be achieved by adding more resources that are available for use. When adding additional resources, scaling can be achieved through:

- Scaling up: Adding more CPUs, more disks, or more memory
- Scaling out: Adding additional machines. This is the typical distributed cluster model, enlarging the cluster size to increase performance.

Actian DataFlow is designed to handle scaling in either direction. Applications built using DataFlow will scale equally well on "fat" SMP (symmetric multiprocessor) machines and on clusters of "skinny" commodity PCs. Concurrency and parallelism help to reduce execution times and enable an application to scale as the data volume to process increases.

The Actian DataFlow nodes that you included as part of the simple ETL workflow can be configured to use the DataFlow scalability by changing the Job Manager settings to use the DataFlow Executor instead of the KNIME engine.

For example, to configure a Text Reader node to use the DataFlow Executor

1. Right-click the Delimited Text Reader node and select Configure.
2. Select the Job Manager Selection tab.
3. Select DataFlow Executor as the job manager.
4. Select development for the Profile.

Note: The SimpleETL workflow was created using the Actian "development" profile. You can set up different Actian profiles (such as production or test) that define your environment, including the level of parallelism and whether to execute in a cluster. To do so, in KNIME select File, Preferences, Actian.

[Diagram: the time difference between a workflow executed with the default KNIME job manager (sequential execution of nodes) and the DataFlow engine (parallel execution of nodes).]

Details on performance and scalability can be found on the DataFlow General Concepts help page. Details on how to configure a workflow to execute with DataFlow are on the Enabling Workflows to Execute with DataFlow help page.

Execute in Batch Mode

KNIME can be run in headless batch mode by specifying parameters and their values on the command line. To see usage information on valid input parameters, cd (change directory) to the directory where KNIME is installed, and run the following command:

./knime -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION

For more details on running KNIME in batch mode, see the KNIME FAQ on tech.knime.org.

Understanding Batch Mode Command Parameters

To execute a workflow in batch mode you need to know the following:

- Full path to the knime executable
- Full path to the workflow subdirectory (see page 49) in the workspace directory
- Full path and name of the exported preferences file

Workflow Directory

When using the KNIME workbench, all workflows are shown in a repository view. All workflows are stored in a subdirectory inside the KNIME workspace directory. For example, the Simple ETL workflow you created in Lesson 3 (see page 33) will be workflow/simpleetl.

Note: The KNIME workspace directory was created when you started KNIME for the first time on your machine.
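Putting these parameters together, here is a minimal sketch of running the SimpleETL workflow in batch mode. The workspace and preferences paths are assumptions; substitute the actual paths on your machine. The -reset option re-executes all nodes, and -nosave discards the executed state instead of writing it back to the workflow directory:

./knime -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION \
    -workflowDir="/home/actian/knime-workspace/SimpleETL" \
    -preferences="/home/actian/dataflow.epf" \
    -reset -nosave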

More information

SQL Server Integration Services with Oracle Database 10g

SQL Server Integration Services with Oracle Database 10g SQL Server Integration Services with Oracle Database 10g SQL Server Technical Article Published: May 2008 Applies To: SQL Server Summary: Microsoft SQL Server (both 32-bit and 64-bit) offers best-of breed

More information

Arcserve Cloud. Arcserve Cloud Getting Started Guide

Arcserve Cloud. Arcserve Cloud Getting Started Guide Arcserve Cloud Arcserve Cloud Getting Started Guide This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the Documentation ) is

More information

CA Spectrum and CA Embedded Entitlements Manager

CA Spectrum and CA Embedded Entitlements Manager CA Spectrum and CA Embedded Entitlements Manager Integration Guide CA Spectrum Release 9.4 - CA Embedded Entitlements Manager This Documentation, which includes embedded help systems and electronically

More information

Legal Information Trademarks Licensing Disclaimer

Legal Information Trademarks Licensing Disclaimer Scribe Insight Tutorials www.scribesoft.com 10/1/2014 Legal Information 1996-2014 Scribe Software Corporation. All rights reserved. Complying with all applicable copyright laws is the responsibility of

More information

Nimsoft Monitor. dns_response Guide. v1.6 series

Nimsoft Monitor. dns_response Guide. v1.6 series Nimsoft Monitor dns_response Guide v1.6 series CA Nimsoft Monitor Copyright Notice This online help system (the "System") is for your informational purposes only and is subject to change or withdrawal

More information

CA Clarity Project & Portfolio Manager

CA Clarity Project & Portfolio Manager CA Clarity Project & Portfolio Manager Connector for CA Unicenter Service Desk & CA Software Change Manager for Distributed Product Guide v2.0.00 This documentation, which includes embedded help systems

More information

FOR WINDOWS FILE SERVERS

FOR WINDOWS FILE SERVERS Quest ChangeAuditor FOR WINDOWS FILE SERVERS 5.1 User Guide Copyright Quest Software, Inc. 2010. All rights reserved. This guide contains proprietary information protected by copyright. The software described

More information

CA Cloud Service Delivery Platform

CA Cloud Service Delivery Platform CA Cloud Service Delivery Platform Service Level Manager Version 01.0.00 This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the

More information

CA Nimsoft Monitor Snap

CA Nimsoft Monitor Snap CA Nimsoft Monitor Snap Configuration Guide for IIS Server Monitoring iis v1.5 series Legal Notices This online help system (the "System") is for your informational purposes only and is subject to change

More information

Upgrade Guide. CA Application Delivery Analysis 10.1

Upgrade Guide. CA Application Delivery Analysis 10.1 Upgrade Guide CA Application Delivery Analysis 10.1 This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the Documentation ) is

More information

CA Mobile Device Management. How to Create Custom-Signed CA MDM Client App

CA Mobile Device Management. How to Create Custom-Signed CA MDM Client App CA Mobile Device Management How to Create Custom-Signed CA MDM Client App This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as

More information

Legal Notes. Regarding Trademarks. 2012 KYOCERA Document Solutions Inc.

Legal Notes. Regarding Trademarks. 2012 KYOCERA Document Solutions Inc. Legal Notes Unauthorized reproduction of all or part of this guide is prohibited. The information in this guide is subject to change without notice. We cannot be held liable for any problems arising from

More information

Jet Data Manager 2012 User Guide

Jet Data Manager 2012 User Guide Jet Data Manager 2012 User Guide Welcome This documentation provides descriptions of the concepts and features of the Jet Data Manager and how to use with them. With the Jet Data Manager you can transform

More information

Visual Studio.NET Database Projects

Visual Studio.NET Database Projects Visual Studio.NET Database Projects CHAPTER 8 IN THIS CHAPTER Creating a Database Project 294 Database References 296 Scripts 297 Queries 312 293 294 Visual Studio.NET Database Projects The database project

More information

CA Spectrum. Microsoft MOM and SCOM Integration Guide. Release 9.4

CA Spectrum. Microsoft MOM and SCOM Integration Guide. Release 9.4 CA Spectrum Microsoft MOM and SCOM Integration Guide Release 9.4 This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the Documentation

More information

Create a New Database in Access 2010

Create a New Database in Access 2010 Create a New Database in Access 2010 Table of Contents OVERVIEW... 1 CREATING A DATABASE... 1 ADDING TO A DATABASE... 2 CREATE A DATABASE BY USING A TEMPLATE... 2 CREATE A DATABASE WITHOUT USING A TEMPLATE...

More information

VERITAS Backup Exec TM 10.0 for Windows Servers

VERITAS Backup Exec TM 10.0 for Windows Servers VERITAS Backup Exec TM 10.0 for Windows Servers Quick Installation Guide N134418 July 2004 Disclaimer The information contained in this publication is subject to change without notice. VERITAS Software

More information

CA Nimsoft Unified Management Portal

CA Nimsoft Unified Management Portal CA Nimsoft Unified Management Portal HTTPS Implementation Guide 7.6 Document Revision History Document Version Date Changes 1.0 June 2014 Initial version for UMP 7.6. CA Nimsoft Monitor Copyright Notice

More information

CA NetQoS Performance Center

CA NetQoS Performance Center CA NetQoS Performance Center Install and Configure SSL for Windows Server 2008 Release 6.1 (and service packs) This Documentation, which includes embedded help systems and electronically distributed materials,

More information

ORACLE BUSINESS INTELLIGENCE WORKSHOP

ORACLE BUSINESS INTELLIGENCE WORKSHOP ORACLE BUSINESS INTELLIGENCE WORKSHOP Integration of Oracle BI Publisher with Oracle Business Intelligence Enterprise Edition Purpose This tutorial mainly covers how Oracle BI Publisher is integrated with

More information

SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package 7 2015-11-24. Data Federation Administration Tool Guide

SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package 7 2015-11-24. Data Federation Administration Tool Guide SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package 7 2015-11-24 Data Federation Administration Tool Guide Content 1 What's new in the.... 5 2 Introduction to administration

More information

Data Domain Profiling and Data Masking for Hadoop

Data Domain Profiling and Data Masking for Hadoop Data Domain Profiling and Data Masking for Hadoop 1993-2015 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or

More information

Universal Management Service 2015

Universal Management Service 2015 Universal Management Service 2015 UMS 2015 Help All rights reserved. No parts of this work may be reproduced in any form or by any means - graphic, electronic, or mechanical, including photocopying, recording,

More information

Implementing a SAS 9.3 Enterprise BI Server Deployment TS-811. in Microsoft Windows Operating Environments

Implementing a SAS 9.3 Enterprise BI Server Deployment TS-811. in Microsoft Windows Operating Environments Implementing a SAS 9.3 Enterprise BI Server Deployment TS-811 in Microsoft Windows Operating Environments Table of Contents Introduction... 1 Step 1: Create a SAS Software Depot..... 1 Step 2: Prepare

More information

Unicenter Patch Management

Unicenter Patch Management Unicenter Patch Management Best Practices for Managing Security Updates R11 This documentation (the Documentation ) and related computer software program (the Software ) (hereinafter collectively referred

More information

QAD Enterprise Applications. Training Guide Demand Management 6.1 Technical Training

QAD Enterprise Applications. Training Guide Demand Management 6.1 Technical Training QAD Enterprise Applications Training Guide Demand Management 6.1 Technical Training 70-3248-6.1 QAD Enterprise Applications February 2012 This document contains proprietary information that is protected

More information

BID2WIN Workshop. Advanced Report Writing

BID2WIN Workshop. Advanced Report Writing BID2WIN Workshop Advanced Report Writing Please Note: Please feel free to take this workbook home with you! Electronic copies of all lab documentation are available for download at http://www.bid2win.com/userconf/2011/labs/

More information

Utilities. 2003... ComCash

Utilities. 2003... ComCash Utilities ComCash Utilities All rights reserved. No parts of this work may be reproduced in any form or by any means - graphic, electronic, or mechanical, including photocopying, recording, taping, or

More information

CA Nimsoft Monitor. Probe Guide for CA ServiceDesk Gateway. casdgtw v2.4 series

CA Nimsoft Monitor. Probe Guide for CA ServiceDesk Gateway. casdgtw v2.4 series CA Nimsoft Monitor Probe Guide for CA ServiceDesk Gateway casdgtw v2.4 series Copyright Notice This online help system (the "System") is for your informational purposes only and is subject to change or

More information

Sage Intelligence Financial Reporting for Sage ERP X3 Version 6.5 Installation Guide

Sage Intelligence Financial Reporting for Sage ERP X3 Version 6.5 Installation Guide Sage Intelligence Financial Reporting for Sage ERP X3 Version 6.5 Installation Guide Table of Contents TABLE OF CONTENTS... 3 1.0 INTRODUCTION... 1 1.1 HOW TO USE THIS GUIDE... 1 1.2 TOPIC SUMMARY...

More information

Sage 300 ERP 2012. Sage CRM 7.1 Integration Guide

Sage 300 ERP 2012. Sage CRM 7.1 Integration Guide Sage 300 ERP 2012 Sage CRM 7.1 Integration Guide This is a publication of Sage Software, Inc. Version 2012 Copyright 2012. Sage Software, Inc. All rights reserved. Sage, the Sage logos, and the Sage product

More information

SQL Server An Overview

SQL Server An Overview SQL Server An Overview SQL Server Microsoft SQL Server is designed to work effectively in a number of environments: As a two-tier or multi-tier client/server database system As a desktop database system

More information

CA Clarity PPM. Connector for Microsoft SharePoint Product Guide. Service Pack 02.0.01

CA Clarity PPM. Connector for Microsoft SharePoint Product Guide. Service Pack 02.0.01 CA Clarity PPM Connector for Microsoft SharePoint Product Guide Service Pack 02.0.01 This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred

More information

Database Studio is the new tool to administrate SAP MaxDB database instances as of version 7.5.

Database Studio is the new tool to administrate SAP MaxDB database instances as of version 7.5. 1 2 3 4 Database Studio is the new tool to administrate SAP MaxDB database instances as of version 7.5. It replaces the previous tools Database Manager GUI and SQL Studio from SAP MaxDB version 7.7 onwards

More information

Matisse Installation Guide for MS Windows. 10th Edition

Matisse Installation Guide for MS Windows. 10th Edition Matisse Installation Guide for MS Windows 10th Edition April 2004 Matisse Installation Guide for MS Windows Copyright 1992 2004 Matisse Software Inc. All Rights Reserved. Matisse Software Inc. 433 Airport

More information

ODBC Driver Version 4 Manual

ODBC Driver Version 4 Manual ODBC Driver Version 4 Manual Revision Date 12/05/2007 HanDBase is a Registered Trademark of DDH Software, Inc. All information contained in this manual and all software applications mentioned in this manual

More information

NETWORK PRINT MONITOR User Guide

NETWORK PRINT MONITOR User Guide NETWORK PRINT MONITOR User Guide Legal Notes Unauthorized reproduction of all or part of this guide is prohibited. The information in this guide is subject to change without notice. We cannot be held liable

More information

CA Clarity Project & Portfolio Manager

CA Clarity Project & Portfolio Manager CA Clarity Project & Portfolio Manager Using CA Clarity PPM with Open Workbench and Microsoft Project v12.1.0 This documentation and any related computer software help programs (hereinafter referred to

More information

User Guide Release 3.5

User Guide Release 3.5 September 19, 2013 User Guide Release 3.5 User Guide Revision/Update Information: September 19, 2013 Software Version: PowerBroker Auditor for File System 3.5 Revision Number: 0 COPYRIGHT NOTICE Copyright

More information

Installing OneStop Reporting Products

Installing OneStop Reporting Products Installing OneStop Reporting Products Contents 1 Introduction 2 Product Overview 3 System Requirements 4 Deployment 5 Installation 6 Appendix 2010 OneStop Reporting http://www.onestopreporting.com support@onestopreporting.com

More information

CA Nimsoft Monitor. Probe Guide for E2E Application Response Monitoring. e2e_appmon v2.2 series

CA Nimsoft Monitor. Probe Guide for E2E Application Response Monitoring. e2e_appmon v2.2 series CA Nimsoft Monitor Probe Guide for E2E Application Response Monitoring e2e_appmon v2.2 series Copyright Notice This online help system (the "System") is for your informational purposes only and is subject

More information

Data Integrator Guide

Data Integrator Guide Data Integrator Guide Operations Center 5.0 March 3, 2014 Legal Notices THIS DOCUMENT AND THE SOFTWARE DESCRIBED IN THIS DOCUMENT ARE FURNISHED UNDER AND ARE SUBJECT TO THE TERMS OF A LICENSE AGREEMENT

More information

email-lead Grabber Business 2010 User Guide

email-lead Grabber Business 2010 User Guide email-lead Grabber Business 2010 User Guide Copyright and Trademark Information in this documentation is subject to change without notice. The software described in this manual is furnished under a license

More information

Installing the BlackBerry Enterprise Server Management Software on an administrator or remote computer

Installing the BlackBerry Enterprise Server Management Software on an administrator or remote computer Installing the BlackBerry Enterprise Server Management Software on an administrator or Introduction Some administrators want to install their administrative tools on their own Windows 2000 computer. This

More information

Secure Agent Quick Start for Windows

Secure Agent Quick Start for Windows Secure Agent Quick Start for Windows 1993-2015 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)

More information

Moving the Web Security Log Database

Moving the Web Security Log Database Moving the Web Security Log Database Topic 50530 Web Security Solutions Version 7.7.x, 7.8.x Updated 22-Oct-2013 Version 7.8 introduces support for the Web Security Log Database on Microsoft SQL Server

More information

INTEGRATING MICROSOFT DYNAMICS CRM WITH SIMEGO DS3

INTEGRATING MICROSOFT DYNAMICS CRM WITH SIMEGO DS3 INTEGRATING MICROSOFT DYNAMICS CRM WITH SIMEGO DS3 Often the most compelling way to introduce yourself to a software product is to try deliver value as soon as possible. Simego DS3 is designed to get you

More information

Virtual Data Centre. User Guide

Virtual Data Centre. User Guide Virtual Data Centre User Guide 2 P age Table of Contents Getting Started with vcloud Director... 8 1. Understanding vcloud Director... 8 2. Log In to the Web Console... 9 3. Using vcloud Director... 10

More information

MAS 500 Intelligence Tips and Tricks Booklet Vol. 1

MAS 500 Intelligence Tips and Tricks Booklet Vol. 1 MAS 500 Intelligence Tips and Tricks Booklet Vol. 1 1 Contents Accessing the Sage MAS Intelligence Reports... 3 Copying, Pasting and Renaming Reports... 4 To create a new report from an existing report...

More information

SilkTest Workbench. Getting Started with.net Scripts

SilkTest Workbench. Getting Started with.net Scripts SilkTest Workbench Getting Started with.net Scripts Borland Software Corporation 4 Hutton Centre Dr., Suite 900 Santa Ana, CA 92707 Copyright 2010 Micro Focus (IP) Limited. All Rights Reserved. SilkTest

More information

SAP Data Services 4.X. An Enterprise Information management Solution

SAP Data Services 4.X. An Enterprise Information management Solution SAP Data Services 4.X An Enterprise Information management Solution Table of Contents I. SAP Data Services 4.X... 3 Highlights Training Objectives Audience Pre Requisites Keys to Success Certification

More information

How To Use Query Console

How To Use Query Console Query Console User Guide 1 MarkLogic 8 February, 2015 Last Revised: 8.0-1, February, 2015 Copyright 2015 MarkLogic Corporation. All rights reserved. Table of Contents Table of Contents Query Console User

More information

The cloud server setup program installs the cloud server application, Apache Tomcat, Java Runtime Environment, and PostgreSQL.

The cloud server setup program installs the cloud server application, Apache Tomcat, Java Runtime Environment, and PostgreSQL. GO-Global Cloud 4.1 QUICK START SETTING UP A WINDOWS CLOUD SERVER AND HOST This guide provides instructions for setting up a cloud server and configuring a host so it can be accessed from the cloud server.

More information

Accounts Payable Workflow Guide. Version 11.2

Accounts Payable Workflow Guide. Version 11.2 Accounts Payable Workflow Guide Version 11.2 Copyright Information Copyright 2013 Informa Software. All Rights Reserved. No part of this publication may be reproduced, transmitted, transcribed, stored

More information

CA Change Manager Enterprise Workbench r12

CA Change Manager Enterprise Workbench r12 CA Change Manager Enterprise Workbench r12 Database Support for Microsoft SQL Server 2008 This documentation and any related computer software help programs (hereinafter referred to as the "Documentation")

More information

Working with SQL Server Integration Services

Working with SQL Server Integration Services SQL Server Integration Services (SSIS) is a set of tools that let you transfer data to and from SQL Server 2005. In this lab, you ll work with the SQL Server Business Intelligence Development Studio to

More information

Scheduling in SAS 9.4 Second Edition

Scheduling in SAS 9.4 Second Edition Scheduling in SAS 9.4 Second Edition SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2015. Scheduling in SAS 9.4, Second Edition. Cary, NC: SAS Institute

More information

SAS Visual Analytics 7.2 for SAS Cloud: Quick-Start Guide

SAS Visual Analytics 7.2 for SAS Cloud: Quick-Start Guide SAS Visual Analytics 7.2 for SAS Cloud: Quick-Start Guide Introduction This quick-start guide covers tasks that account administrators need to perform to set up SAS Visual Statistics and SAS Visual Analytics

More information

VERITAS NetBackup 6.0

VERITAS NetBackup 6.0 VERITAS NetBackup 6.0 Backup, Archive, and Restore Getting Started Guide for UNIX, Windows, and Linux N15278C September 2005 Disclaimer The information contained in this publication is subject to change

More information

Using SQL Reporting Services with Amicus

Using SQL Reporting Services with Amicus Using SQL Reporting Services with Amicus Applies to: Amicus Attorney Premium Edition 2011 SP1 Amicus Premium Billing 2011 Contents About SQL Server Reporting Services...2 What you need 2 Setting up SQL

More information

CA Nimsoft Monitor. Probe Guide for Active Directory Response. ad_response v1.6 series

CA Nimsoft Monitor. Probe Guide for Active Directory Response. ad_response v1.6 series CA Nimsoft Monitor Probe Guide for Active Directory Response ad_response v1.6 series Legal Notices This online help system (the "System") is for your informational purposes only and is subject to change

More information

Tips and Tricks SAGE ACCPAC INTELLIGENCE

Tips and Tricks SAGE ACCPAC INTELLIGENCE Tips and Tricks SAGE ACCPAC INTELLIGENCE 1 Table of Contents Auto e-mailing reports... 4 Automatically Running Macros... 7 Creating new Macros from Excel... 8 Compact Metadata Functionality... 9 Copying,

More information

13 Managing Devices. Your computer is an assembly of many components from different manufacturers. LESSON OBJECTIVES

13 Managing Devices. Your computer is an assembly of many components from different manufacturers. LESSON OBJECTIVES LESSON 13 Managing Devices OBJECTIVES After completing this lesson, you will be able to: 1. Open System Properties. 2. Use Device Manager. 3. Understand hardware profiles. 4. Set performance options. Estimated

More information

MICROSOFT OFFICE ACCESS 2007 - NEW FEATURES

MICROSOFT OFFICE ACCESS 2007 - NEW FEATURES MICROSOFT OFFICE 2007 MICROSOFT OFFICE ACCESS 2007 - NEW FEATURES Exploring Access Creating and Working with Tables Finding and Filtering Data Working with Queries and Recordsets Working with Forms Working

More information

CA XOsoft Replication for Windows

CA XOsoft Replication for Windows CA XOsoft Replication for Windows Microsoft SQL Server Operation Guide r12.5 This documentation and any related computer software help programs (hereinafter referred to as the Documentation ) is for the

More information

Mobile Time Manager. Release 1.2.1

Mobile Time Manager. Release 1.2.1 Mobile Time Manager Release 1.2.1 This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the Documentation ) is for your informational

More information

Portions of this product were created using LEADTOOLS 1991-2009 LEAD Technologies, Inc. ALL RIGHTS RESERVED.

Portions of this product were created using LEADTOOLS 1991-2009 LEAD Technologies, Inc. ALL RIGHTS RESERVED. Installation Guide Lenel OnGuard 2009 Installation Guide, product version 6.3. This guide is item number DOC-110, revision 1.038, May 2009 Copyright 1992-2009 Lenel Systems International, Inc. Information

More information

Version 3.8. Installation Guide

Version 3.8. Installation Guide Version 3.8 Installation Guide Copyright 2007 Jetro Platforms, Ltd. All rights reserved. This document is being furnished by Jetro Platforms for information purposes only to licensed users of the Jetro

More information