White paper Planning for SaaS Integration
Key planning considerations:
- Business Process Modeling
- Data Modeling and Mapping
- Data Ownership
- Integration Strategy
- Security
- Quality of Data (Data Cleansing)

Capabilities needed from the SaaS integration layer:
- Transformation
- Transport Protocols
- Workflow
- Bulk Data Load
- Transaction Failure Compensation
- Identity Correlation
- Adapters

Planning for SaaS Integration

There are many planning considerations to account for when integrating with Software as a Service (SaaS) solutions. The complexity and effort of SaaS integration are often underestimated. In many respects, integrating with SaaS is similar to integrating with a business partner's externally hosted system, and less like integrating two internal systems. This white paper covers key planning considerations related to data modeling, data mapping and security, as well as key capabilities needed in a SaaS integration layer.

What is SaaS (Software as a Service)?

Software as a Service solutions have been on the market for several years now. Vendors have varying definitions, and many vendor solutions have recently been rebranded as SaaS solutions. This paper starts by establishing the key elements of SaaS products that distinguish them from packaged software, web applications, cloud computing and traditional application hosting:

- SaaS solutions are centered around specific business functionality, for example:
  - Sales Force Automation (Salesforce.com)
  - ERP (NetSuite)
  - HR (SuccessFactors, Workday)
  - Procurement (Ariba)
  - E-mail and collaboration (Google™ Apps and Gmail™)
  - Supply Chain Management (i2)
  - Payment Processing (Amazon Flexible Payment Services)
- SaaS solutions are highly configurable, not just from a user interface standpoint, but also extensible through the data model, an API (Application Programming Interface) or custom code.
- The SaaS licensing model is typically pay-per-use or subscription-based. This can translate into lower up-front costs compared to packaged software.
- SaaS solutions are managed and hosted by a third party, frequently in a multitenant environment.

SaaS applications fall into the category of cloud computing. Cloud computing has evolved into a broader term that includes infrastructure and platform as a service products (e.g. Amazon EC2, Microsoft Windows Azure, Rackspace Hosting), while SaaS refers to a specific software application and its underlying data made available through the cloud (e.g. Salesforce.com).

Summa White Paper: Planning for SaaS Integration 2011 Summa
Key Planning Considerations

The following are key areas to be addressed as part of SaaS integration:

- Business Process Modeling
- Data Modeling and Mapping
- Data Ownership
- Integration Strategy
- Security
- Quality of Data (Data Cleansing)

Business Process Modeling

Proper implementation of a SaaS solution provides the opportunity to modernize business processes and provide users with an integrated experience. To realize the full potential of SaaS, SaaS implementation projects must start with business process modeling. This includes identifying and developing a high-level understanding of the changes in business processes and of the business-level interactions between the SaaS solution and internal systems. Process models should be developed with key stakeholders on the business side in order to take into account the longer-term vision and its potential business integrations (and to avoid integrating unneeded processes and data).

Data Modeling and Mapping

In most cases, the SaaS vendor hosts data storage. The underlying database is often hidden and not directly available for integration purposes. Service and data integration must be performed through either API calls or bulk data loading. While bulk loading can be a good approach for handling initial loads and batch synchronizations to the SaaS system, a true integration can only be achieved through the vendor-provided API. Such integration requires establishing a mapping between the SaaS data model and the data model of the existing systems. This requires an understanding of the entity relationships in each model and the ability to create a data mapping and translation between two or more data models. Normally, there will not be a direct one-to-one mapping among the data models. This is where the SaaS solution's customization capabilities can be leveraged. Some SaaS solutions allow extension of their data model through custom objects and custom relationships.
In addition to establishing a data mapping between the models, it is also necessary to correlate unique IDs between the systems (for example, connecting a customer ID in CRM to the same customer's ID in an ERP system). There are several options for accomplishing this. One option involves storing external system IDs within the SaaS database. Another option is creating a cross-reference mapping outside the SaaS system, in the integration layer.
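The mapping and ID correlation described above can be illustrated with a minimal Python sketch. It assumes the first option (the external ERP ID is stored on the SaaS record in a custom field). All field names, including `ERP_Customer_Id__c`, are hypothetical and are not taken from any vendor's actual data model.

```python
# Hypothetical field mapping: SaaS CRM field name -> internal ERP field name.
# Real mappings are rarely one-to-one; this sketch covers only the simple case.
CRM_TO_ERP_FIELDS = {
    "AccountName": "customer_name",
    "BillingCity": "city",
    "Phone": "phone_number",
}


def crm_to_erp(crm_record, erp_id_field="ERP_Customer_Id__c"):
    """Translate a CRM record (a dict) into the ERP field layout.

    Fields that are not part of the mapping are dropped, which mirrors
    synchronizing only a subset of each entity.
    """
    erp_record = {
        erp_name: crm_record[crm_name]
        for crm_name, erp_name in CRM_TO_ERP_FIELDS.items()
        if crm_name in crm_record
    }
    # Assumption: the ERP customer ID is kept in a custom field on the
    # SaaS record, so it can be copied over to correlate the two systems.
    if crm_record.get(erp_id_field):
        erp_record["customer_id"] = crm_record[erp_id_field]
    return erp_record
```

For example, a CRM record containing `AccountName`, `BillingCity`, the custom ID field and an unmapped `Industry` field would be translated into an ERP record with only `customer_name`, `city` and `customer_id`.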
Data Ownership

Even if the SaaS solution becomes the new system-of-record, it may not be the owner of all the data. Based on the business requirements, there could be a subset of the data that needs to be owned or maintained by another system. The easiest solution is one system-of-record for each data set. However, data updates often originate from multiple sources. It is critical to identify such requirements early on, as they can expand the scope of the integration and make it more complex.

Integration Strategy

There are many factors that can impact the integration strategy and the total cost of ownership. The integration approach will depend on business requirements, which may also be influenced by constraints of the SaaS product and legacy systems. Here is a summary of key decisions that need to be considered when determining the integration approach:

Real-time vs. batch integration. Integration can be done in real time through small transactions or through batch synchronization. Batch synchronization is a feasible option only when a delay is acceptable and no concurrent updates are expected (i.e. integration is one-way). A direct, real-time integration will in most cases be the ideal solution, as it provides the most up-to-date information to the user while minimizing the potential for duplicate or inconsistent data across systems.

Push vs. pull. Another factor to account for is which system initiates the calls. A live integration will require the SaaS product to initiate calls to internal systems through the firewall. The SaaS application may not necessarily provide this capability. Even if it does, it is often desirable to maintain this logic in house, as it may require complex transformation and workflow logic. Additionally, there may be security implications in allowing an external system to initiate API calls through the firewall.

Direction (one-way vs. bi-directional). As mentioned above, depending on data ownership, it may be necessary to support a bi-directional integration. In that case it is essential to establish schemes for correlating data across systems, in order to maintain uniqueness and to merge changes.

Frequency. Depending on how frequently data must be synchronized, a full integration layer may not be required. This may be the case if the synchronization only needs to occur as a one-time load, or via infrequent bulk loads. In such scenarios, the bulk load capabilities built into the SaaS product or an ETL (Extract, Transform, Load) tool may be sufficient to satisfy the integration needs.
Partial vs. full entity synchronization. Typically, only certain fields or a subset of entities will need to be synchronized between SaaS and existing systems. This may significantly impact the scope, complexity, capacity and performance requirements of the integration solution. The fields and entities that need to be synchronized should be identified early on, along with the data mapping.

Security

In terms of security, as in any application, there are two main areas that must be accounted for:

Authentication. The SaaS vendor will provide its own built-in authentication capabilities. However, in many cases single sign-on will be required. Because of this, it is critical that the SaaS vendor provides single sign-on or an API for delegated authentication. Single sign-on is often implemented using SAML (Security Assertion Markup Language).

Authorization. Authorization must be provided at a role-based level. Some SaaS solutions also provide record-level authorization. It is also necessary to examine and understand access policies that span systems. For instance, data that is integrated from an internal system into a SaaS solution becomes subject to the SaaS solution's authorization rules, which may unintentionally expose sensitive internal data to users of the SaaS system.

Two authorization protocols that are emerging as standards are OAuth and AuthSub. The two are similar in that they allow users to grant a third-party application restricted access to partial data or resources without revealing the user's credentials to that application. The user authorizes access from the external application once; on subsequent access, the external application acts on behalf of the user by submitting a special token. Such standards are supported by popular services such as Google's Web Apps, Netflix and Twitter™, among many others.

Quality of Data (Data Cleansing)

A key area that is often overlooked early in SaaS integration planning is the quality of existing data. In most cases, the constraints on the data and its intended use in the new SaaS application are very different from those in existing legacy and connected systems. Business stakeholder awareness and ownership of data quality issues are critical to project success. SaaS vendors may impose certain constraints, such as required fields that may not be populated in the existing data. All of these constraints need to be identified early on, so that either the data can be improved prior to the integration or the integration approach can be designed to accommodate the data quality issues with separate cleansing products or approaches. Master Data Management (MDM) is one such approach that can help manage and consolidate enterprise-wide data (e.g. customer and product information). Trillium Software is a popular MDM solution which can be used for improving data quality and data governance.
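As an illustration of identifying such constraints early, the following Python sketch checks existing records against a set of required fields before any load is attempted. The field names and the required-field list are hypothetical; an actual SaaS product defines its own constraints.

```python
# Hypothetical required fields imposed by the target SaaS application.
REQUIRED_FIELDS = ("last_name", "email")


def find_missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent, None or blank."""
    return [
        field for field in required
        if not str(record.get(field) or "").strip()
    ]


def partition_records(records, required=REQUIRED_FIELDS):
    """Split records into loadable ones and ones that need cleansing.

    Each rejected entry carries the list of missing fields, so the data
    owners on the business side can see what has to be fixed.
    """
    clean, rejected = [], []
    for record in records:
        missing = find_missing_fields(record, required)
        if missing:
            rejected.append((record, missing))
        else:
            clean.append(record)
    return clean, rejected
```

Running a check like this against an extract of the legacy data gives an early estimate of how much cleansing effort the project needs to plan for.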
The SaaS Integration Layer

Depending on the business requirements and the integration capabilities of the chosen SaaS product, the integration approach may not be trivial. While a comprehensive API offered by the SaaS vendor is a must-have, in most scenarios a custom SaaS integration layer will be needed to comply with SOA (Service Oriented Architecture) principles and to facilitate integration with existing systems. This section provides an overview of the capabilities that a typical SaaS integration layer needs to provide, as well as which integration products best fit those capabilities. The following diagram depicts what a SaaS integration layer may look like and how it interacts with existing systems and the SaaS solution:

[Diagram: three areas labeled SaaS, SaaS Integration Layer and Existing Applications. Labels in the diagram: Internet (Cloud); SaaS Provider (Multitenant Environment); HTTP/S, SOAP, REST; Firewall and/or SOA Appliance; ETL; BPM; ESB; Existing Web/Clustered Enterprise Apps; Message Queues; DB; Legacy/Mainframe Enterprise Apps; End Users; Road Warriors; Intranet Users.]
From an implementation perspective, SaaS integration projects tend to be very similar to typical integration projects. Unless the integration requirements are very simple, it makes more sense to use an integration product as the foundation of the SaaS integration layer than to build homegrown integration middleware. Many organizations already have an off-the-shelf integration product or a defined architecture strategy in place, which should be leveraged for integrating the SaaS solution. The SaaS integration layer may comprise one or more integration products; an Enterprise Service Bus (ESB) is often a good choice that will address most of the integration needs. Because there is no one-size-fits-all integration approach, consider a solution that will address most of the following integration needs:

- Transformation
- Transport Protocols
- Workflow
- Bulk Data Load Capabilities
- Transaction Failure Compensation
- Identity Correlation
- Adapters

Transformation

This is perhaps the most basic requirement that an integration layer will need to provide. One of the main goals of the SaaS integration layer is to abstract and hide internal details about the SaaS solution from existing applications. One such detail to shield from other systems is the SaaS data model. Hiding the SaaS data model will help to mitigate upgrade or migration pains in future releases. The integration layer will be responsible for transforming data between the SaaS data schema and the enterprise schema, in both directions. More often than not, this is accomplished through the use of a canonical or intermediate schema based on industry or corporate standards.

Transport Protocols

Support for multiple transport protocols is a must. This is an area where good ESB products excel. The supported protocols should range from message queues (e.g. IBM WebSphere MQ, JMS) through HTTP/S, S/FTP (Secure File Transfer Protocol), file directories and proprietary packaged application integration adapters. A good integration product will provide retry and error handling capabilities, timeout intervals and the ability to customize certain parameters, such as encryption protocols and credentials (e.g. certificates).
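The retry capability mentioned above is normally supplied by the integration product itself. Purely to illustrate the behavior, here is a minimal Python sketch of a retry wrapper with exponential backoff. `TransientError`, the parameter names and the default values are assumptions made for this example and do not correspond to any particular product's API.

```python
import time


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying (e.g. a timeout)."""


def call_with_retry(operation, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff.

    The wait doubles after each failed attempt (0.5s, 1s, 2s, ...). After
    the last attempt the error is re-raised so that the caller's error
    handling (alerting, dead-letter queue, manual intervention) can take over.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable only so the wrapper can be exercised without real delays; any other exception type propagates immediately, since retrying a non-transient failure (such as a validation error) would not help.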
Workflow

Most ESB products provide basic workflow capabilities. If there is a need to support complex business workflows, an ESB may not be the best integration approach, or it will need to be augmented with a separate BPMS (Business Process Management System). BPM systems are better equipped to handle long-running transactions, i.e. workflows that require manual and/or offline steps and therefore require the state of the transaction to be persisted for some period of time while people or other systems respond. They are also better suited to performing conditional steps based on the completion of previous steps (e.g. only complete the transaction if the required approvals are received).

Bulk Data Load Capabilities

If there is a need for heavy bulk load capabilities, a third integration product category should be considered: ETL (Extract, Transform and Load). ETL solutions are often used for large, direct database-to-database data loading. Frequently, a SaaS solution will not allow direct access to its underlying database, which in almost all cases will be hosted by a third party (i.e. the SaaS provider) in the cloud. In some instances, the SaaS system may already provide basic data load capabilities, which may be sufficient. If that is not the case, ETL products can help significantly, especially if the data transformation is very complex. Traditional ETL vendors, such as IBM and Informatica, also offer adapters for select SaaS platforms.

Transaction Failure Compensation

Most services in a SOA (Service Oriented Architecture) integration do not support two-phase commit or rollback of transactions across services. A common approach to providing this capability is through compensation paths. For example, when a transaction that spans multiple systems fails partway through, a compensation path may delete a record that was inserted in one of its earlier steps.
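The idea of a compensation path can be sketched in a few lines of Python. This is an illustration only: each step is paired with a hypothetical "undo" callable, and the in-memory dictionaries stand in for the SaaS system and an internal system.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs in order.

    If any action fails, the compensations of the steps that already
    completed are run in reverse order, then the error is re-raised.
    """
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        raise


# Example: the CRM insert succeeds, the ERP insert fails, so the CRM
# record is deleted again by its compensation step.
crm_records = {}

def insert_crm():
    crm_records["42"] = "Acme"

def delete_crm():
    crm_records.pop("42", None)

def insert_erp():
    raise RuntimeError("ERP unavailable")

try:
    run_with_compensation([(insert_crm, delete_crm), (insert_erp, lambda: None)])
except RuntimeError:
    pass  # crm_records is empty again at this point
```

Note that compensation is not a true rollback: other systems may have observed the inserted record before it was deleted, and a compensation step can itself fail.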
BPM systems provide support for compensating transactions through the WS-BPEL (Web Services Business Process Execution Language) standard, which includes definitions for compensation handlers. Frequently, a manual intervention process may be required for legacy systems that cannot be modified to support the process needs.

Identity Correlation

The SaaS solution and the existing systems will most likely each have their own identifiers for data elements. Many integration use cases will require correlating these identifiers. This correlation can be delegated to a system at either end of the integration (e.g. the SaaS system, or one of the existing systems). It is important to keep in mind that additional systems with their own identifiers may be brought into the picture later. Because of this, a good practice is to create a cross-reference database that is independent of the end systems. The SaaS integration layer (or existing integration layer) is a good place to expose this cross-reference database as a service to other systems, i.e. to allow establishing new cross-references or querying for existing ones. This approach reduces the need to carry multiple IDs throughout the integration flows.
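A cross-reference service of this kind can be sketched as follows. This is a minimal in-memory Python illustration; the class name, method names and system names are hypothetical, and a real implementation would be backed by a database and exposed as a service.

```python
import uuid


class CrossReferenceRegistry:
    """Maps each system's local ID to a shared, system-independent key."""

    def __init__(self):
        self._key_by_id = {}   # (system, local_id) -> shared key
        self._ids_by_key = {}  # shared key -> {system: local_id}

    def register(self, system, local_id, entity_key=None):
        """Record a system's ID for an entity and return the shared key.

        Pass the key returned by an earlier call to link another
        system's ID to the same entity.
        """
        key = entity_key or uuid.uuid4().hex
        self._key_by_id[(system, local_id)] = key
        self._ids_by_key.setdefault(key, {})[system] = local_id
        return key

    def translate(self, system, local_id, target_system):
        """Return the target system's ID for the same entity, or None."""
        key = self._key_by_id.get((system, local_id))
        if key is None:
            return None
        return self._ids_by_key[key].get(target_system)


registry = CrossReferenceRegistry()
key = registry.register("crm", "001A")
registry.register("erp", "C-1001", entity_key=key)
erp_id = registry.translate("crm", "001A", "erp")  # "C-1001"
```

Because every system is linked through the shared key rather than directly to every other system, a system added later only needs one `register` call per entity to become translatable to and from all the others.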
Adapters

Most integration products provide built-in or add-on integration adapters that are tailored for a specific product or technology. The SaaS vendor may also provide its own set of adapters. For instance, Salesforce.com offers the AppExchange™ marketplace, which provides over 900 application extensions for Salesforce.com, some of which provide additional integration capabilities for external products or technologies. Cast Iron Systems (recently acquired by IBM) takes this approach to another level with the OmniCast™ platform. OmniCast is an integration solution, delivered as an appliance or as a service, that provides many of the capabilities mentioned above but is tailored for SaaS integration. It provides pre-configured integration templates for Salesforce.com, NetSuite, Google Apps and Microsoft Dynamics™, along with adapters for many traditional packaged enterprise software products such as SAP, Oracle E-Business Suite, JD Edwards and Siebel.

Summary

SaaS solutions are quick to implement and often go live sooner than traditional packaged software. They do not require procurement and setup of new servers, time for installation or system administration costs. A quick proof of concept can easily be up and running within a few days, allowing users to focus on the functionality. For the SaaS functionality to be useful and efficient, it must integrate with other business processes and data. As noted throughout this white paper, properly integrating a SaaS solution with existing systems is not as trivial a task as many vendors may claim. When undertaking a SaaS project, the SaaS integration requirements, design and implementation should be considered and tracked as a separate sub-project, with dependencies and tasks assigned to the corresponding teams. This will help ensure that the integration aspect is not underestimated and that the full value of the SaaS solution can be realized. Planning and designing for a SaaS integration layer upfront is crucial for a successful delivery of the overall SaaS implementation.

About Summa

Since 1996, Summa has been providing high-impact IT consulting services and customized, commercial-grade software development for companies ranging from regional businesses to Global 2000 firms. Summa specializes in helping companies evaluate and implement IT modernization strategies to better meet their business objectives and is an industry-leading provider of Service Oriented Architecture (SOA), portal and BPM solutions. To learn more about our SaaS/Cloud Integration Practice, contact us at saas@summa-tech.com or visit www.summa-tech.com.

About The Author

Jorge Balderas is a Distinguished Technical Consultant for Summa with over 10 years of experience in the architecture, design and development of mission-critical application and software development projects. Jorge blends and applies strong technical skills and project leadership to Summa's clients across all aspects of the software development lifecycle. He has a strong understanding of Service Oriented Architecture (SOA), business application integration and software development methodologies.
References

Cloud computing: http://www.infoworld.com/print/34031
System-of-record: http://www.information-management.com/issues/20030501/6645-1.html?zkprintable=true
ETL (Extract, Transform, Load): http://www.computerworld.com/s/article/print/89534/quickstudy_etl
SAML (Security Assertion Markup Language): http://saml.xml.org/
OAuth: http://oauth.net/about/, http://code.google.com/apis/accounts/docs/oauth.html, http://developer.netflix.com/docs/security, http://apiwiki.twitter.com/oauth-faq
AuthSub: http://code.google.com/apis/accounts/docs/authsub.html
XML (Extensible Markup Language): http://www.w3.org/xml/
Service-Oriented Architecture: http://www-01.ibm.com/software/solutions/soa/
Enterprise Service Bus (ESB): http://www-01.ibm.com/software/integration/wsesb/v6/faqs.html#do
Canonical Schema: http://www.soapatterns.org/canonical_schema.php
IBM WebSphere MQ: http://www-01.ibm.com/software/integration/wmq/
JMS (Java Messaging Service): http://java.sun.com/products/jms/
BPMS (Business Process Management System): http://www.bpm.com/what-does-bpm-actually-mean.html
WS-BPEL (Web Services Business Process Execution Language): http://www.oasis-open.org/committees/download.php/23964/wsbpel-v2.0-primer.htm
AppExchange: http://sites.force.com/appexchange/
Cast Iron OmniCast: http://www.castiron.com/integration-solutions/
Summa Blog SaaS Integration Series: http://www.summa-tech.com/blog/series/saas-integration/