Talend Component tgoogleanalyticsmanagement

Similar documents
Cloud Backup for Joomla

Copyright Pivotal Software Inc, of 10

Smart Card Authentication. Administrator's Guide

Commerce Services Documentation

Data Mailbox. support.ewon.biz. Reference Guide

Smart Card Authentication Client. Administrator's Guide

Analytics Canvas Tutorial: Cleaning Website Referral Traffic Data. N m o d a l S o l u t i o n s I n c. A l l R i g h t s R e s e r v e d

This document describes the capabilities of NEXT Analytics v5.1 to retrieve data from Google Analytics directly into your spreadsheet file.

Qlik REST Connector Installation and User Guide

Cloud Elements! Marketing Hub Provisioning and Usage Guide!

PLSAP CONNECTOR FOR TALEND USER MANUAL

SafeNet KMIP and Google Cloud Storage Integration Guide

Google Analytics and Google Analytics Premium: limits and quotas

OAuth 2.0 Developers Guide. Ping Identity, Inc th Street, Suite 100, Denver, CO

How To Use Kiteworks On A Microsoft Webmail Account On A Pc Or Macbook Or Ipad (For A Webmail Password) On A Webcomposer (For An Ipad) On An Ipa Or Ipa (For

The full setup includes the server itself, the server control panel, Firebird Database Server, and three sample applications with source code.

CRM Migration Manager for Microsoft Dynamics CRM. User Guide

Mass Announcement Service Operation

INTEGRATING MICROSOFT DYNAMICS CRM WITH SIMEGO DS3

SnapLogic Salesforce Snap Reference

REST Webservices API Reference Manual

RCS Liferay Google Analytics Portlet Installation Guide

Traitware Authentication Service Integration Document

TIBCO Spotfire Metrics Modeler User s Guide. Software Release 6.0 November 2013

Oracle Database 11g: SQL Tuning Workshop

Merchant Reporting Tool

Contents. 2 Alfresco API Version 1.0

Manage Workflows. Workflows and Workflow Actions

Oracle Siebel Marketing and Oracle B2B Cross- Channel Marketing Integration Guide ORACLE WHITE PAPER AUGUST 2014

Technical Support Set-up Procedure

Google Docs Print. Administrator's Guide

USER GUIDE. PowerMailChimp CRM 2011

Setting up single signon with Zendesk Remote Authentication

SAP InfiniteInsight Explorer Analytical Data Management v7.0

Kofax Export Connector for Microsoft SharePoint

Coveo Platform 7.0. Salesforce Connector Guide

Index. AdWords, 182 AJAX Cart, 129 Attribution, 174

Programming Autodesk PLM 360 Using REST. Doug Redmond Software Engineer, Autodesk

Enterprise Service Bus

Big Data Operations Guide for Cloudera Manager v5.x Hadoop

How to select the right Marketing Cloud Edition

How to Ingest Data into Google BigQuery using Talend for Big Data. A Technical Solution Paper from Saama Technologies, Inc.

Resco CRM Server Guide. How to integrate Resco CRM with other back-end systems using web services

Cloud Elements ecommerce Hub Provisioning Guide API Version 2.0 BETA

System Administrator Training Guide. Reliance Communications, Inc. 603 Mission Street Santa Cruz, CA

Getting Started with the Ed-Fi ODS and Ed-Fi ODS API

Copyright 2014 Jaspersoft Corporation. All rights reserved. Printed in the U.S.A. Jaspersoft, the Jaspersoft

Oracle BI Cloud Service : What is it and Where Will it be Useful? Francesco Tisiot, Principal Consultant, Rittman Mead OUG Ireland 2015, Dublin

Salesforce Classic Guide for iphone

Integrated Billing Solutions with HP CSA 4.00

vcloud Air Platform Programmer's Guide

Oracle Database 11g: SQL Tuning Workshop Release 2

Integration Overview. Web Services and Single Sign On

A Step by Step Guide on Integrating Data in

Portals and Hosted Files

Test Automation Integration with Test Management QAComplete

Introduction to the SIF 3.0 Infrastructure: An Environment for Educational Data Exchange

Merchant Interface Online Help Files

Building Views and Charts in Requests Introduction to Answers views and charts Creating and editing charts Performing common view tasks

Yale Secure File Transfer User Guide

REST Webservices API Tutorial

Salesforce Opportunities Portlet Documentation v2

Salesforce Files Connect Implementation Guide

Security and ArcGIS Web Development. Heather Gonzago and Jeremy Bartley

Software Requirements Specification for DLS SYSTEM

User Guide. Version R91. English

Talend Open Studio for MDM. Getting Started Guide 6.0.0

Opacus Outlook Addin v3.x User Guide

Release Notes for Patch Release #2614

Talend Component: tjasperreportexec

OpenText Information Hub (ihub) 3.1 and 3.1.1

Egnyte Single Sign-On (SSO) Installation for OneLogin

Dell One Identity Cloud Access Manager How to Develop OpenID Connect Apps

Fairsail REST API: Guide for Developers

IBM API Management Overview IBM Corporation

Installation & Configuration Guide User Provisioning Service 2.0

Sage Intelligence Financial Reporting for Sage ERP X3 Version 6.5 Installation Guide

Integrating VoltDB with Hadoop

API Guide v /11/2013

Software Engineering I CS524 Professor Dr. Liang Sheldon X. Liang

RS MDM. Integration Guide. Riversand

!!!!!!!! Startup Guide. Version 2.7

CommonSpot Content Server Version 6.2 Release Notes

11 Aug 2014 Beta Draft

FmPro Migrator - FileMaker to SQL Server

Office365Mon Developer API

Adeptia Suite 6.2. Application Services Guide. Release Date October 16, 2014

KMx Enterprise: Integration Overview for Member Account Synchronization and Single Signon

Forumbee Single Sign- On

I) Add support for OAuth in CAS server

API documentation - 1 -

Configuring Your Gateman Server

1 Which of the following questions can be answered using the goal flow report?

Cloud Elements! Events Management BETA! API Version 2.0

SyncThru Database Migration

Work with PassKey Manager

Using RADIUS Agent for Transparent User Identification

Big Data Analytics in LinkedIn. Danielle Aring & William Merritt

Sisense. Product Highlights.

Getting Started with the icontact API

Transcription:

Talend Component tgoogleanalyticsmanagement Purpose This component addresses the needs of gathering Google Analytics data for a large number of profiles and fine-grained detail data. To prepare the queries you need to know what account, profiles and so on you have access. This component collects all elements of the Google analytics model and provides their data as separate output flows. The component uses the Core Reporting API 3.0 and the Authentication API OAuth 2.0 final. To provide the ability to run in multiple iterations the component has special capabilities to avoid multiple logins through iterations. Usually automated processes should not use personal accounts. This requirement is addressed by using Service Accounts, which are the only preferred way to login into Google Analytics for automated processes. Account Web property Profile An account contains one or more web properties. A web property contains one or more profiles. Segment Goal A segment is independent; a goal always belongs to a profile. Additional the component provides an output flow for un-sampled reports. But this requires a premium account. It is recommended to use the flow from the component tgoogleanalyticsunsampledreports. Talend-Integration This component can be found in the palette under Business->Google Parameters Parameters to connect to Google Analytics (setup client) Property Content Data types Application Name Use existing client Authentication Method Not necessary, but recommended by Google. Simple provide the name of your application gathering data. Required Choose here the tgoogledrive component which client do you want to reuse in this component instance. Choose the method to authenticate: Service Account or Client-ID for native applications Boolean Properties to use the Service Account Property Content Data types Service Account Email The email address of the service account. Google creates this address within the process of creating a service account. Only for service accounts! Required Key File (p12) The Service Account Login works with private key file for authentication. In the process of creating a service account you download this file. Properties to use the Client-ID for native applications Property Content Data types User Account Email Email of the user account or the Client-ID Client secret file (json) This json file downloaded for the Client-ID The usage of the Client-ID for native applications expects on the first run an user interaction with the Google web page and after finishing the form to approve the access right you need to close the browser to let the component continue, otherwise the authentication process will not complete.

Miscellaneous Properties Property Content Data types Ignore permission errors while collecting user permissions If some if the profiles, web properties or accounts are not accessible with Manage User rights, with this option the component ignores the problem and continues with the next request. Boolean Example: How to configure an account to allow collecting user permission information. Advanced Option Parameters Property Timeout in s Static Time Offset (to past) Reuse Client for Iterations Distinct Name Extension Content How long should the component wait for getting the first result and fetching all result with one internal iteration Within the process of login, the component requests an access token and use the local time stamp (because these tokens will expire after a couple of seconds) Google rejects all requests to access tokens when the request is in the future compared to the timestamp in Google servers. If you experience such kind of problems, this options let the requests appear to be more in the past (5-10s was recognized as good time shift) If you use this component in iterations it is strongly recommended to set this option. It saves time to authenticate unnecessary often and avoids problems because of maximum amount of connects per time range. The client will be kept with an automatically created name: Talend-Name-Component name + job name. In case this is not distinct enough, you can specify an additional extension to the name. Return values Return value ERROR_MESSAGE ACCOUNTS_NB_LINE WEBPROPERTIES_NB_LINE PROFILES_NB_LINE SEGMENTS_NB_LINE GOALS_NB_LINE GOAL_URL_DEST_STEPS GOAL_EVENT_CONDITIONS REPORT_COLUMNS_NB_LINE ACCOUNT_USER_PERMISSIONS_NB_LINE Content Last error message Number of accounts Number of web properties. Number of profiles. Number of segments Number of goals Number of URL destination steps belongs to goals Number of event conditions belongs to goals Number of column (dimensions and metrics) metadata records Number users have access the accounts WEBPROPERTY_USER_PERMISSIONS_NB_LINE Number users have access to the web properties PROFILE_USER_PERMISSIONS_NB_LINE CUSTOM_DATA_SOURCES_NB_LINE Number users have access to the profiles (views) Number of Custom Data Sources

Output flows The output flows have fixed schemas. The schema columns are completely defined with key, nullable and length. Because of the well-defined schemas, the flows can direct connected to database output components with the option to create the tables. Output flow: Accounts ACCOUNT_ID Long Primary key ACCOUNT_NAME Name ACCOUNT_CREATED Date Created at ACCOUNT_UPDATED Date Updated at ACCOUNT_SELFLINK Url to manage the account Output flow: Web properties WEBPROPERTY_ID Primary key, This is the key you use within your web pages to track them. WEBPROPERTY_INTERNAL_ID Long Internal ID WEBPROPERTY_NAME Name WEBPROPERTY_SITE_URL URL of the tracked site WEBPROPERTY_CREATED Date Created at WEBPROPERTY_UPDATED Date Updated at WEBPROPERTY_SELFLINK Url to manage the web property WEBPROPERTY_LEVEL Level for this web property. Possible values are STANDARD or PREMIUM WEBPROPERTY_INDUSTRY_VERTICAL Output flow: Profiles The industry vertical/category selected for this web property WEBPROPERTY_ID Primary key, This is the key you use within your web pages to track them. WEBPROPERTY_INTERNAL_ID Long Internal ID PROFILE_NAME Name PROFILE_DEFAULT_PAGE PROFILE_EXCLUDE_QUERY_PARAMS PROFILE_SITE_SEARCH_CATEGORY_PARAMS PROFILE_CURRENCY Currency in 3 digits PROFILE_TIMEZONE Time zone (long description) PROFILE_CREATED Date Created at PROFILE_UPDATED Date Updated at

PROFILE_SELFLINK URL to manage the profile PROFILE_ECOMMERCE_TRACKING PROFILE_STRIP_SITE_SEARCH_QUERY_PARAMS PROFILE_STRIP_SITE_SEARCH_CATEGORY_PARAMS Output flow: Segments Boolean Indicates whether ecommerce tracking is enabled for this view (profile) Boolean Whether or not Analytics will strip search query parameters from the URLs in your reports. Boolean Whether or not Analytics will strip search category parameters from the URLs in your reports. SEGMENT_ID Primary key (only unique within your account) SEGMENT_NAME Name SEGMENT_DEFINITION The filter definition SEGMENT_CREATED Date Created at, can be null! SEGMENT_UPDATED Date Updated at, can be null! Output flow: Goals GOAL_ID Integer ID of the goal, the ID is only unique with the profile WEBPROPERTY_ID Primary key, This is the key you use within your web pages to track them. WEBPROPERTY_INTERNAL_ID Long Internal ID GOAL_NAME Name GOAL_CREATED GOAL_UPDATED GOAL_ACTIVE Boolean GOAL_TYPE Goal type. Possible values are URL_DESTINATION, VISIT_TIME_ON_SITE, VISIT_NUM_PAGES, and EVENT. GOAL_VALUE VISIT_TIME_ON_SITE_DETAILS_COMP_TYPE Type of comparison. Possible values are LESS_THAN or GREATER_THAN. Long VISIT_TIME_ON_SITE_DETAILS_COMP_VALUE Long VISIT_NUM_PAGES_DETAILS_COMP_TYPE VISIT_NUM_PAGES_DETAILS_COMP_VALUE URL_DEST_DETAILS_URL URL of the destination (can be relative) URL_DEST_DETAILS_CASE_SENSITIVE Boolean Whether or not the URL is case sensitive URL_DEST_DETAILS_MATCH_TYPE Method to match the URL (e.g. REGEX) Long URL_DEST_DETAILS_FIRST_STEP_REQUIERED Boolean If the first step is required. See next flow Goal

Url Destination Steps GOAL_SELFLINK URL to the Google server to manage the goal Output flow: Goal Url Destination Steps WEBPROPERTY_ID Primary key, This is the key you use within your web pages to track them. WEBPROPERTY_INTERNAL_ID Long Internal ID URL_DEST_STEP_INDEX Integer Index of the step within the goal URL_DEST_STEP_NAME Name of the step URL_DEST_STEP_NUMBER Integer URL_DEST_STEP_URL URL to reach this step Output flow: Goal Event Conditions WEBPROPERTY_ID WEBPROPERTY_INTERNAL_ID Long EVENT_CONDITION_INDEX Integer Index of the condition within the goal EVENT_CONDITION_COMP_TYPE Condition type EVENT_CONDITION_COMP_VALUE Long Value to evaluate the condition EVENT_CONDITION_EXPRESSION Expression to evaluate the condition EVENT_CONDITION_MATCH_TYPE Method to check the condition (e.g. REGEX) EVENT_CONDITION_TYPE Output flow: Dimensions and Metrics COL_TYPE Type of the column METRIC or DIMENSION COL_API_NAME Identifier for the usage in the API requests COL_UI_NAME Human readable name in the dashboard COL_DESCRIPTION Description of the dimension or metric COL_DATA_TYPE The data type in capital letters COL_GROUP The category of the column COL_STATUS The status like PUBLIC=use it, DEPRECATED=using the replacement soon COL_REPLACED_BY The replacement column for deprecated columns COL_CALCULATION For metrics the calculation formula if the metric is the result of a calculation, otherwise empty for direct measured metrics.

Output flow: User Permissions for Accounts EMAIL User email address (also in case of using service accounts) PERMISSIONS_LOCAL Comma separated list of the permissions explicit set for the account PERMISSIONS_EFFECTIVE Comma separated list of the permissions implicit given for the account Output flow: User Permissions for Web Properties WEBPROPERTY_ID ID of the web property EMAIL User email address (also in case of using service accounts) PERMISSIONS_LOCAL Comma separated list of the permissions explicit set for the web property PERMISSIONS_EFFECTIVE Comma separated list of the permissions implicit given for the web property Output flow: User Permissions for Views WEBPROPERTY_ID ID of the web property EMAIL User email address (also in case of using service accounts) PERMISSIONS_LOCAL Comma separated list of the permissions explicit set for the view PERMISSIONS_EFFECTIVE Comma separated list of the permissions implicit given for the view Output flow: Custom Data Sources (CDS) CDS_ID Long ID of the custom data source WEBPROPERTY_ID ID of the web property CDS_NAME Name CDS_TYPE Type like COST or REFUND etc. CDS_IMPORT_BEHAVIOR How to handle uploads. Example: OVERWRITE CDS_LINKED_PROFILES Comma separated list of the linked views CDS_CREATED Date Date and time when the data source has been created CDS_UPDATED Date Date and time when the data source has been updated

Scenario: Fetch all meta-data In this scenario you fetch all data from accounts, web properties, profiles, segments, goals and goal-url-destination-steps and goal-event-conditions, unsampled reports and 3 times user permissions and save them into separate database tables. All database output components get their schema from the flow. and here the component basic settings: To create this scenario you have to do: 1. Drop the tgoogleanalyticsmanagement component and 12 database output components into the job. 2. Give all 12 database output components its table names and configure them to Create table if not exists 3. Than create the 12 flows. 4. Sync all schemas of the database components. That s it. Advise to install and use the component Please ensure, that your service account email address is added as user to all relevant profiles! You will only retrieve account, web properties, profiles where your service account email is added (to the profiles, that enough). The best way is to download and install it from Talend Exchange. In case of missing libraries: 1. Close Studio 2. Delete the file: %Studio install dir%/configuration/componentscache.javacache 3. Start Studio In the Exchange panel you can update or delete your downloaded components.

Next a real live scenario: In this scenario the job recognises when some changes in the metadata happens using SCD logic. The job starts with collecting all metadata into the memory and next starts writing the data within a transaction. Collecting the metadata. Open a database connection and start writing the metadata for accounts, profiles, web properties and segments Continued next page

Continue writing goals, goal URL destinations and goal event conditions. And finally the metadata about dimensions and metrics and commit the transaction