THE DATA HANDBOOK
Data Architecture for Salesforce Marketing Cloud
Eliot Harper
Copyright 2018 Eliot Harper

Published by AttributeValue Pty Limited, Melbourne, Australia. All rights reserved.

Book and cover design by Scott Citron

The publisher has taken care in the preparation of this publication but makes no expressed or implied warranty of any kind and assumes no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information contained herein.

About the Author

Eliot Harper is a Salesforce MVP and an acknowledged expert in Salesforce Marketing Cloud. Eliot has consulted for some of the largest brands in the world and has written several books on Marketing Cloud.
Contents

Introduction
Data Models
    All Subscribers List
    Lists
    Data Extensions
    Contacts
Data Transformation and Segmentation
Marketing Cloud Connect
Conclusion
Introduction

Salesforce Marketing Cloud is a digital marketing platform with an application suite that enables highly personalized and automated communication across numerous channels and devices. It allows marketers to deliver the right message, to the right people, at the right time.

But effective marketing starts with data. The efficient delivery of relevant communication requires a well-considered and optimized data model, which in turn provides agility across marketing campaigns and journeys. This enables you to respond to changing customer and market demands, while also providing more accessible and powerful analytics.

It's also important to consider the appropriate use of Marketing Cloud. Constraints typically arise when the platform is used as a substitute for enterprise applications; for example, in the storage and transformation of voluminous and complex datasets (or big data). While Marketing Cloud is an enterprise-grade platform that automates and executes campaigns and journeys at scale, it should be used as one component within a broader technology stack, where other suitable platforms are used for data-related processes such as warehousing, transformation, auditing, analysis and visualization.
This handbook explains the fundamental concepts that apply when working with data in Marketing Cloud. It also spells out the considerations and best practices for building a well-designed data model, to ensure optimal performance and accessibility of data across all of your marketing activities.
Data Models

Marketing Cloud uses a number of data models for storing data, or attributes. It's important to understand the purpose of each model to determine its appropriate use.

All Subscribers List

The All Subscribers list is a master list of all people, or Subscribers, who can receive email communications from your organization. When an email is sent from the platform and the Subscriber does not exist in the All Subscribers list, they are appended to it as a record.

Each Marketing Cloud account includes an All Subscribers list, which can optionally be segmented at a Business Unit level by applying Subscriber Filters. Each Subscriber within the All Subscribers list is represented by a unique, user-defined identifier referred to as a Subscriber Key.

The All Subscribers list has two main purposes:

1. It identifies the status of the Subscriber: whether they have unsubscribed from receiving emails, or whether emails to their address are undeliverable (held) or have been returned (bounced).
2. It can optionally be used to store data related to a Subscriber, through profile attributes.
Considerations

- Limited data types are available when storing Subscriber data as profile attributes; specifically, text, number or date types.
- Profile attributes are defined globally and apply to all Subscriber records in the account. They cannot be defined at a list or group level.
- Profile attributes can only exist as a one-to-one relationship for a Subscriber; for example, you can't use profile attributes to store relational data, such as purchase history.

Best Practices

- The Subscriber Key is a system of record and should use a unique identifier that persists over the Subscriber's lifetime; for example, a customer number. You should not use an email address as a Subscriber Key value.
- Subscriber Filters should not be used to segment a large number of records by Business Unit, particularly when multiple groups and conditions are applied to filter criteria, as this will directly impact performance.
- Profile attributes are stored by the platform as a string type, irrespective of the data types assigned to them. The platform casts them to their assigned type (for example, date) when required for conditional evaluation or filtering, but in turn this can impact performance. You should consider storing non-text profile attributes (for example, date, number, Boolean or decimal values) in a Data Extension.
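To illustrate why string storage matters for non-text attributes, the following minimal Python sketch compares date values held as text with the same values cast to dates. The attribute values and the month/day/year format are illustrative assumptions; this is not a representation of how the platform performs the cast internally.

```python
from datetime import datetime

# Hypothetical profile attribute values held as strings (assumed format: month/day/year).
renewal_dates = ["9/1/2018", "10/1/2018", "11/15/2018"]


def cast_to_date(value: str) -> datetime:
    """Cast a string attribute to a date; this work is repeated for every value evaluated."""
    return datetime.strptime(value, "%m/%d/%Y")


# Sorting the raw strings compares them character by character, which is not chronological.
sorted_as_text = sorted(renewal_dates)

# Casting first gives the chronological order, at the cost of one conversion per value.
sorted_as_dates = sorted(renewal_dates, key=cast_to_date)
```

The text ordering places "9/1/2018" last, so any evaluation that needs date semantics has to cast every value first. Storing the value with a native date type avoids that repeated conversion.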
Lists

A list is a collection of Subscribers that belong to the All Subscribers list. Lists provide a simple and convenient method for segmenting your audience and identifying who should receive specific email communications; for example, weekly newsletter recipients or loyalty members.

Lists are used to store a Subscriber's subscription status for a given list. Lists can also be used to further segment your audience into filtered or random groups.

Considerations

- Subscribers are imported into lists at an average rate of 500,000 records per hour, as List Detective operations (a process of washing Subscriber email addresses against a database of known bad email addresses) occur at the time of import.
- Lists can only contain Subscriber records, not other datasets.
- There is no overwrite option available for updating lists. You can only add or update Subscriber records in a list.
- If a Subscriber unsubscribes from the All Subscribers list, they are also unsubscribed from all lists and groups.
- If a Subscriber's status changes back to active on the All Subscribers list, they are not resubscribed to lists or groups.
- Subscribers who unsubscribe from all lists will still have an active subscription status on the All Subscribers list.
- Lists cannot be shared across Business Units. They are only available within the Business Unit that they are created in.

Best Practices

- Choose lists if you require simplicity in your data model over performance.
- Lists should not be used to segment a large number of Subscribers; for example, more than 500,000 records.

Data Extensions

A Data Extension is a relational database table that can be used to store any Subscriber-related data, including:

1. email Subscriber data (similar to lists), referred to as a Sendable Data Extension
2. relational data for a Subscriber; for example, order history
3. form data submitted from a landing page
4. Salesforce Object records.

Data Extension fields (or attributes) are stored in a Microsoft SQL Server (Structured Query Language) database. They offer more flexibility and extensibility than lists as they provide a relational data model and support a greater number of data types.

Data Extensions provide permission-based access, where policies can be assigned to fields and deletion. Data Extensions can also be selectively shared across Business Units.
A list-based subscription model is also available for Data Extensions through Publication Lists, which enable you to manage unsubscribes at a Data Extension level.

Considerations

- Data Extensions use Transact-SQL, which has a row limit. Columns that don't fit within the limit are placed off-row, in a separate internal table. As a result, the platform has to execute multiple queries in order to return a single row for Data Extension records that exceed this row limit. As a guide, you should limit the number of fields in a Data Extension to no more than 50 and field lengths to 100 characters or fewer.
- Referential integrity (the process of validating the relationships between Data Extension fields) is not enforced by the platform. As a result, a record can be deleted from one Data Extension even though its Primary Key is used as a reference in another Data Extension.
- AMPscript functions that are used to update, insert or delete data from Data Extensions are executed in a single call, after all emails have been sent. If the email is cancelled during send (for example, the RaiseError function is used), then no Data Extension records will be modified.

Best Practices

- Lighten the platform load by only importing the data that you need for your marketing communications.
- Apply a Data Retention policy to Data Extensions that don't require persistent storage. For example, if you use a Data
Extension to store all customer orders, but only require the last six months of data to segment customers based on their order history, then apply a retention setting that deletes individual Data Extension records older than six months.
- Avoid situations where simultaneous processes perform update or insert operations on the same Data Extension. For example, if a landing page is used to update a Data Extension row from a lead capture form, you should ensure that other platform operations are not performed on the same Data Extension at the same time.
- Assign Primary Keys or Composite Keys (two or more Primary Keys) to non-nullable fields that contain a unique value for each record. The platform will build an index for key-based Data Extensions, which will benefit performance. Assigning many Composite Keys to Data Extension fields can be detrimental when done too aggressively, as the generated index can increase data complexity. You should limit Composite Keys to critical fields.
- When audience-level opt-ins are required, select a Publication List at send time to manage opt-ins for the Data Extension.
- Assign appropriate data types to Data Extension fields. While data can be cast to a different type through queries or scripting, this adds complexity and may impact performance.
- Assign an appropriate character length to text fields. For example, if you know that a product code will always be limited to 10 characters, then set the field length to 10 characters. Longer fields impose an overhead that isn't always necessary.
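Because referential integrity is not enforced between Data Extensions, it can help to check for orphaned records yourself. The following is a minimal sketch of such a check, using an in-memory SQLite database as a stand-in for two Data Extensions. The table names, field names and values are hypothetical, and SQLite syntax is an assumption made so that the example runs anywhere; a Query Activity on the platform would express the same join in its own SQL dialect.

```python
import sqlite3

# In-memory SQLite database standing in for two Data Extensions (hypothetical names and fields).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY, EmailAddress TEXT NOT NULL);
    CREATE TABLE Orders (OrderID TEXT PRIMARY KEY, CustomerID TEXT NOT NULL, Total REAL);
    INSERT INTO Customers VALUES ('C-1001', 'ana@example.com'), ('C-1002', 'ben@example.com');
    INSERT INTO Orders VALUES ('O-1', 'C-1001', 40.0), ('O-2', 'C-1002', 15.5), ('O-3', 'C-1002', 99.0);
""")

# No relationship is declared between the tables, so nothing prevents this delete,
# even though Orders still references C-1002.
conn.execute("DELETE FROM Customers WHERE CustomerID = 'C-1002'")

# Find Orders rows whose CustomerID no longer matches a Customers row.
orphaned_orders = conn.execute("""
    SELECT o.OrderID, o.CustomerID
    FROM Orders AS o
    LEFT JOIN Customers AS c ON c.CustomerID = o.CustomerID
    WHERE c.CustomerID IS NULL
    ORDER BY o.OrderID
""").fetchall()
```

Here the two orders for the deleted customer are returned as orphans. The join is made on key fields and each selected field is named explicitly.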
Contacts

Marketing Cloud includes a Contact model that enables data relationships to be consolidated, organized and linked to a person, or Contact. This model, available through Contact Builder, provides a single customer view of the engagement metrics, subscriptions and attributes related to a Contact.

Unlike the All Subscribers list, the Contact model is not based on a specific channel. A Contact record includes two primary keys: a Contact ID, which is a system-defined identifier, and a Contact Key, which is a user-defined identifier and serves as the system of record for a Contact.

Contact records are derived from a Population, which provides a master set of Contacts. In much the same way that Subscriber records are created in the All Subscribers list, a Population is also used to associate an email address and Contact Key with a Contact when a Contact record is created.

Considerations

- If you use numeric identifiers to classify Contacts, the data type will need to be set as a text type in order to create a relationship with the Contact Key of the Contact record.
- A Contact model is specific to each Business Unit and cannot be shared across Business Units.
- A Contact, from a billing and usage perspective, includes All Contacts in the Contact model and also any individual who has received a message from any channel, even if they do not appear in the All Contacts list.
Best Practices

- When creating relationships between Attribute Sets in Data Designer, link them using Primary Keys. The platform creates an index based on these relationships, which provides performance benefits.
- Contact Key values must match the Subscriber Key values used in Email Studio.
- The number of Contacts is determined by and de-duplicated against the Contact Key value, so ensure that you use a consistent Contact Key for all Contacts in your account.
- The Contact ID is a surrogate key used by the platform and should not be used for creating relationships with the Contact record.
- Populations represent a master group of Contacts and should be limited to no more than three within a Business Unit, to avoid performance issues.
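The effect of de-duplicating on the Contact Key can be shown with a short sketch. The channel names, key values and the count_contacts helper below are hypothetical and only model the counting rule described above; they are not a platform API.

```python
# Hypothetical channel records for three people. Each tuple is (channel, contact_key).
consistent = [
    ("email", "C-1001"), ("sms", "C-1001"),
    ("email", "C-1002"), ("push", "C-1002"),
    ("email", "C-1003"),
]

# The same three people, but the SMS and push records use other identifiers as the key.
inconsistent = [
    ("email", "C-1001"), ("sms", "+61400000001"),
    ("email", "C-1002"), ("push", "device-7f3a"),
    ("email", "C-1003"),
]


def count_contacts(records):
    """Count Contacts by de-duplicating on the Contact Key value."""
    return len({contact_key for _channel, contact_key in records})


consistent_count = count_contacts(consistent)      # one Contact per person
inconsistent_count = count_contacts(inconsistent)  # extra Contacts for the same people
```

With a consistent key the three people count as three Contacts. With mixed identifiers the same three people count as five, which matters because Contacts are counted for billing and usage.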
Data Transformation and Segmentation

Marketing Cloud includes Automation Studio, which automates repetitive processes through a multi-step workflow. Applications include:

1. performing Extract, Transform and Load (ETL) processes to import file-based data from external systems
2. automating recurring marketing activities; for example, sending monthly email newsletters to a segmented audience
3. using filters and queries to create targeted audiences, then using these audiences as entry sources for journeys.

Considerations

- While Marketing Cloud enables ETL processing of data from external systems, you should only import the data you need for your marketing campaigns and journeys. The platform is not designed to be a replacement for a data warehouse, which uses hardware, software and resources that are specifically optimized for business intelligence and data analysis operations.
- Consider using other tools if you require complex segmentation, prioritization and data orchestration, then import this data into Marketing Cloud for use in journeys and campaigns, or further segmentation.
- Automation Studio Activities must complete within 30 minutes or they will time out. This consideration is particularly applicable when using SQL queries in Query Activities, as complex queries or large record sets can take a considerable amount of time to execute.
- Queries that include both a JOIN clause and a SELECT * statement are not permitted. Instead, write a statement that specifies each field name.

Best Practices

- Whenever possible, only import or transform data that has differences, or deltas, between the source data and the target Data Extension, to isolate records that have been added or modified since the process was last executed.
- Ensure that Query Activities in an Automation are not executed concurrently against the same Data Extension. Contention issues will arise if two or more processes attempt to perform simultaneous operations on the same Data Extension.
- SQL queries that use JOIN, WHERE, GROUP BY and ORDER BY clauses should be isolated to Primary Key fields in a Data Extension, where possible.
- Reduce the complexity of your SQL queries by splitting them into individual Query Activities and creating staging Data Extensions which are used by subsequent Query Activities. For example, if you need to segment Subscribers who have opened an email in the past week, create a query on the _Open Data View to return SubscriberKey values where the EventDate is in
the past seven days, then use this staging Data Extension in a subsequent Query Activity.
- Don't run multiple Query Activities on the same step in an Automation.
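The staging approach described above can be sketched as two small queries run one after the other. This is a minimal illustration that uses an in-memory SQLite database in place of the platform. The Opens table stands in for the Open Data View, and the Customers and RecentOpeners tables, the Tier field and all values are hypothetical. On the platform each step would be its own Query Activity, and the date filter would be written in Transact-SQL, for example with DATEADD and GETDATE.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")

# Fixed "current time" so the sketch is repeatable; a live query would use the system date.
now = datetime(2018, 6, 15, 9, 0, 0)
cutoff = (now - timedelta(days=7)).isoformat(sep=" ")

conn.executescript("""
    CREATE TABLE Opens (SubscriberKey TEXT, EventDate TEXT);
    CREATE TABLE Customers (SubscriberKey TEXT PRIMARY KEY, EmailAddress TEXT, Tier TEXT);
    CREATE TABLE RecentOpeners (SubscriberKey TEXT PRIMARY KEY);
    INSERT INTO Opens VALUES
        ('C-1001', '2018-06-14 08:30:00'),
        ('C-1001', '2018-06-12 17:05:00'),
        ('C-1002', '2018-05-20 10:00:00'),
        ('C-1003', '2018-06-09 12:00:00');
    INSERT INTO Customers VALUES
        ('C-1001', 'ana@example.com', 'Gold'),
        ('C-1002', 'ben@example.com', 'Gold'),
        ('C-1003', 'cy@example.com', 'Silver');
""")

# Step 1: write Subscribers who opened in the past seven days to the staging table.
conn.execute(
    "INSERT INTO RecentOpeners (SubscriberKey) "
    "SELECT DISTINCT SubscriberKey FROM Opens WHERE EventDate >= ?",
    (cutoff,),
)

# Step 2: a later query joins the small staging table on its key field,
# naming each field rather than combining JOIN with SELECT *.
gold_recent_openers = conn.execute("""
    SELECT c.SubscriberKey, c.EmailAddress
    FROM Customers AS c
    INNER JOIN RecentOpeners AS r ON r.SubscriberKey = c.SubscriberKey
    WHERE c.Tier = 'Gold'
    ORDER BY c.SubscriberKey
""").fetchall()
```

Each step stays simple: the first query reads only the open events, and the second joins on key fields of two small tables.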
Marketing Cloud Connect

Marketing Cloud Connect is a managed package for Sales Cloud and Service Cloud that provides integration with Marketing Cloud through data management, segmentation, and campaign management tools. The package also enables the synchronization of data schemas and relationships from Salesforce Objects, allowing the data to be imported into Synchronized Data Extensions at predefined time intervals.

Considerations

- Salesforce Objects use a different relationship model from Marketing Cloud. Objects can have multiple relationships with other Objects, whereas Contact Builder only supports a single relationship between Objects in an Attribute Group. When Objects are imported into Synchronized Data Extensions, field relationships are remapped using a predetermined priority based on, first, predefined standard Object relationships, then standard relationship fields (in alphabetical order), and finally, custom relationship fields (in alphabetical order).
- Lead and Contact records imported into Synchronized Data Extensions are counted as Contacts from a billing and usage perspective.
- Date and time fields from Salesforce Objects are converted from the specified Sales Cloud or Service Cloud time zone into Central Standard Time (CST).
- Only the data schema is synchronized in Synchronized Data Extensions, not the record itself. A Synchronized Data Extension is read-only and cannot be updated to synchronize changes back to the Salesforce Object.
- Marketing Cloud Connect enables a single Sales Cloud or Service Cloud account to connect with multiple Business Units within a Marketing Cloud account. This can be upgraded to a Multi-Org Account, enabling the connection of multiple Sales Cloud or Service Cloud accounts to individual Business Units in Marketing Cloud. However, this upgrade modifies the underlying account structure and cannot be reverted. Refer to the Salesforce documentation to ensure you are aware of the implications before upgrading to a Multi-Org Account.
- If an email address for a Lead or Contact Object is updated in Sales Cloud or Service Cloud, the corresponding email address is not updated in the All Subscribers list.
- If a Lead has previously been sent an email from Marketing Cloud and is later converted to a Contact in Sales Cloud or Service Cloud, a new record will be created in the All Subscribers list for the Contact when an email is sent. In turn, there will be two records for the same person: one based on the Lead ID and another based on the Contact ID.
- Leads that are converted into Contacts in Sales Cloud or Service Cloud, and have previously opted out of email communication (while they were a Lead), will be resubscribed to receive emails when a new Subscriber record is created in the All Subscribers list for the Contact.
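The date and time conversion noted in the considerations above can shift a value onto a different calendar day. The following minimal sketch shows the arithmetic, under two assumptions that are not stated in this handbook: CST is modeled as a fixed UTC-6 offset with no daylight saving adjustment, and the source org uses a fixed UTC+10 offset. The to_cst helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CST is modeled as a fixed UTC-6 offset with no daylight saving adjustment.
CST = timezone(timedelta(hours=-6), "CST")

# Assumption: the source org's time zone is a fixed UTC+10 offset (labeled AEST here).
AEST = timezone(timedelta(hours=10), "AEST")


def to_cst(local_value: datetime) -> datetime:
    """Convert a time-zone-aware value to the fixed CST offset."""
    return local_value.astimezone(CST)


# A record created on the morning of 1 July in the source org.
created_in_org = datetime(2018, 7, 1, 9, 30, tzinfo=AEST)
created_in_cst = to_cst(created_in_org)
```

Under these assumptions the 1 July morning value becomes the afternoon of 30 June, so date-based filters and retention rules applied to synchronized data should allow for the offset.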
Best Practices

- Only import Salesforce Objects and fields that you require for your marketing communications by using the available field and record collection options when configuring the Object for synchronization.
- As changes to Lead and Contact email addresses are not synchronized in the All Subscribers list, create an Automation (in Automation Studio) to write updated Lead and Contact email addresses to this list.
- As duplicate records are created in the All Subscribers list once a Lead is converted into a Contact, create an Automation to remove deprecated Lead records from this list.
- As email opt-in preferences are not migrated for converted Leads, create an Automation to update the Subscriber status in the All Subscribers list for Leads that have been converted into Contacts and previously opted out of email communication (while they were a Lead).
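The reconciliation logic behind the last two practices can be sketched as follows. This is an illustration of the decision rule only, not an Automation Studio configuration. The reconcile helper, the status values and the record IDs are hypothetical; the 00Q and 003 prefixes follow the usual Salesforce key prefixes for Lead and Contact IDs.

```python
# Hypothetical snapshot of the All Subscribers list: Subscriber Key -> status.
all_subscribers = {
    "00Q000000000001": "unsubscribed",  # Lead that opted out and was later converted
    "00Q000000000002": "active",        # Lead that was later converted
    "00Q000000000003": "active",        # Lead that has not been converted
    "003000000000001": "active",        # Contact created from the first Lead
    "003000000000002": "active",        # Contact created from the second Lead
}

# Hypothetical mapping of converted Leads: Lead ID -> Contact ID.
converted_leads = {
    "00Q000000000001": "003000000000001",
    "00Q000000000002": "003000000000002",
}


def reconcile(subscribers, conversions):
    """Return (Contact keys to unsubscribe, Lead keys to remove) for converted Leads."""
    to_unsubscribe, to_remove = [], []
    for lead_id, contact_id in conversions.items():
        if lead_id not in subscribers:
            continue  # the Lead has no Subscriber record, so there is nothing to reconcile
        if subscribers[lead_id] == "unsubscribed" and contact_id in subscribers:
            to_unsubscribe.append(contact_id)  # carry the opt-out over to the Contact
        to_remove.append(lead_id)  # the Lead record is now deprecated
    return to_unsubscribe, to_remove


contacts_to_unsubscribe, leads_to_remove = reconcile(all_subscribers, converted_leads)
```

In this sample the first Contact inherits the opt-out recorded against its Lead, both converted Lead records are flagged for removal, and the unconverted Lead is left untouched.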
Conclusion

By design, Marketing Cloud is a powerful automation platform that leverages data to create highly personalized and relevant communications. But it's important to recognize that the platform's capabilities are inherently coupled to how it's used and what it's used for. By taking a considered approach to data design, and implementing the best practices outlined in this handbook, you will be able to fully leverage the platform's capabilities and realize the full potential of your marketing activities.