WHITEPAPER
Smart Enterprise Data Management
A new approach to rapid, high-quality data integration, data services, and data governance
Enterprises Need a Better Way to Manage Data

Today, the challenges that IT faces in providing reliable and timely solutions for data needs across a business are universally accepted and well understood. These challenges can be summed up as:

- Explosive growth in the volume and variety of available data (the Big Data challenge)
- Accelerating demands for data in the face of faster product release cycles, complex on-boarding of new customers, and increasing use of analytics-driven decision making
- Proliferation of data consumers, ranging from internal business analysts to customers to B2B partners to regulators

There's too much data that's too fragmented, redundant, underutilized, inconsistent, hard to find, hard to understand, and growing too fast. It's combined with increasing demands for new products, slick apps, new business models, better customer experiences, and compliance with new regulations.

Though the challenges are well understood, solutions are not. Traditional tools for and approaches to enterprise data management (EDM) typically yield inconsistent, ambiguous, inefficient, and unreliable data access. Even when IT has the funding to respond to these challenges with big projects featuring larger MDM implementation teams, support for more content sources, or more extensive ETL deployments, projects still fail to deliver quality data to those who need it, when they need it, and with a reasonable return on investment. We're reaching a breaking point: there must be a better way.

Why Smart Enterprise Data Management?

- Business agility: Respond quickly to business demands for new and reliable data
- Faster time-to-market: Automate on-boarding and integration of data for new products or customers
- Trustworthy data: Keep data relationships, lineage, and governance in sync with deployed systems

A Smarter Approach to Enterprise Data Management

Smart Enterprise Data Management (Smart EDM) is a new paradigm for managing enterprise data.
It is an effective approach for integrating data from varied sources and exposing data via reusable services. Smart EDM manages enterprise data at the conceptual level (its essential meaning as understood by IT and business people), regardless of how fragmented, incomprehensible, or inconsistent the data actually is as stored across or beyond enterprise systems. Its benefits are many, including:

- It provides increased understanding with correspondingly improved data governance, traceability, and quality.
- Business processes suffer fewer errors and less rework because of bad data.
- Compliance initiatives are easier to monitor and control.
- It's faster to bring new products and services online, or to deploy new web and mobile apps.
- Customers and partners can easily access the data they want, offering a premium experience.

Software implementing the Smart EDM approach is characterized by:

- A Common Conceptual Business Model based on semantic data science and industry standards is at the core of the solution.
- The model acts as data glue to link together physical systems and schema, operational ETL jobs, data services, and governance artifacts by their common business meaning.
- The model is used directly and declaratively to automate creation of executable data integration jobs and data services.
- Data, metadata, and model governance is an integral part of the system, including versioning, stewardship, policy management, requirements management, and tracking data use and lineage.
- Business analysts, decision makers, and other subject-matter experts can directly use the model for data discovery, search, and navigation.

What is Smart EDM?

1. Use Common Conceptual Business Models to manage data
2. The model glues together related data
3. Automates data integration and data services
4. Operationalizes data governance
5. The model allows domain experts to access their data easily
Common Conceptual Business Models

The foundation of Smart EDM is the Common Conceptual Business Model (CCBM). The CCBM is a shared definition of the essential business meaning of data and how it is related to other data. It is the organizing principle that simplifies fragmentation and redundancy in enterprise data. It improves productivity by revealing meaning, relationships, usage, and lineage, and by automating development tasks. The CCBM is:

- Common. The CCBM highlights entities and data elements that are shared across business processes, but it does not force a one-size-fits-all, rigid, enterprise-wide definition of these entities.
- Conceptual. The CCBM represents data at a conceptual level, independent of any particular structural or syntactical constraints imposed by the data's physical representation in a specific application or database.
- Business-oriented. The CCBM models entities and processes from the business's point of view, using the vocabularies, relationships, and contexts familiar to business people.

The CCBM builds on and extends trends towards using industry standard conceptual models to help contextualize and understand data.

What's in a name?

While the CCBM is a type of common model based on semantic science, some experienced technical people may be familiar with the idea of using common models to simplify data integration and refer to it as a canonical model.
The Model Glues Together Related Data

Different business entities in physical systems actually share many of the same concepts, meanings, and relationships. For example, a sales-force automation (SFA) tool collects leads, a quote system generates quotes, an order-management system (OMS) processes orders, and a billing application is responsible for invoices. Even though each entity has its own distinct attributes, leads, quotes, orders, and invoices all share some common, essential semantics of a deal.

The CCBM attaches business meaning to data. In this way, data representing a business concept (e.g. lead) is connected to other data also representing or related to that concept (e.g. order), no matter the physical location, format, or identifier used in production systems throughout the enterprise. The CCBM thus acts as a hub for gluing together common business concepts with their physical expression in production systems and databases.

Semantic Data Science Enables Flexible, Bottom-up Models

Common Conceptual Business Models are based on the latest advances and standards in semantic data science. These advances enable highly descriptive data modeling using standard languages and techniques, such as the Semantic Web standards from the World Wide Web Consortium (W3C). These standards were built to support data models at the scale and diversity of the entire Web, and as such are extremely effective at breaking down data silos and complexity at any enterprise scale.

The old concept of a single, grand schema, model, or database has never been practical, and semantic data science offers something different.
Expressed in standard, proven languages such as RDF and OWL, the CCBM is actually a federated collection of models, sub-models, and views that are context-specific to a given business domain or process, yet can all be linked by the common ideas represented across all the data.
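The idea of a federated collection of context-specific models linked by common concepts can be sketched with plain subject-predicate-object triples, the data shape that RDF is built on. This is a minimal illustration and not an RDF or OWL implementation: the prefixes (sfa:, quote:, oms:, billing:, ccbm:), the two sub-models, and the helper related_by_concept are hypothetical names chosen for this example. Only rdfs:subClassOf is a real RDF Schema property.

```python
# Hypothetical sub-models stored as sets of (subject, predicate, object)
# triples. Each sub-model is maintained separately (federated), yet both
# point at the same shared concept, ccbm:Deal.
sales_model = {
    ("sfa:Lead", "rdfs:subClassOf", "ccbm:Deal"),
    ("quote:Quote", "rdfs:subClassOf", "ccbm:Deal"),
}
finance_model = {
    ("oms:Order", "rdfs:subClassOf", "ccbm:Deal"),
    ("billing:Invoice", "rdfs:subClassOf", "ccbm:Deal"),
}


def related_by_concept(models, concept):
    """Return every entity, from any sub-model, that specializes `concept`."""
    found = set()
    for model in models:
        for subject, predicate, obj in model:
            if predicate == "rdfs:subClassOf" and obj == concept:
                found.add(subject)
    return sorted(found)


deal_like = related_by_concept([sales_model, finance_model], "ccbm:Deal")
print(deal_like)
# ['billing:Invoice', 'oms:Order', 'quote:Quote', 'sfa:Lead']
```

Because each fact is an independent triple, a new entity can be linked to the shared concept by adding one triple to one sub-model, without touching the others.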
Furthermore, unlike traditional data models and schemas, semantic data standards are based on graphs (networks) of related concepts, an extremely flexible structure that allows models to be modified and extended in myriad ways at any time. These models are well-suited to address changing business requirements or to accommodate unexpected data from customers, partners, vendors, or even unstructured content.

Automation Drives Productivity and Governance

IT organizations often face a tension between data governance initiatives and operational data processes. The former is primarily concerned with ensuring high-quality, trusted, and well-understood data by cataloging schemas and metadata, by documenting requirements, and by tracking data lineage. The latter is primarily concerned with the speed and cost of bringing useful new data onboard, whether in the form of data warehouses, data marts, or data services. Data migration projects often face quality issues from a lack of effective governance, and data governance initiatives often result in pristine documentation that quickly turns stale and falls out of sync with the realities of runtime data processes.

Semantic data science delivers Common Conceptual Business Models that are both human- and machine-readable, so they can effectively and automatically address the needs of both human-oriented data governance and documentation and machine-oriented operational data processes. Business analysts can declaratively specify both business and data requirements in terms of the CCBM, and these requirements can directly generate service and integration implementations. Conversely, because data process implementations are generated from and glued back to the CCBM, data lineage and usage are fed back into governance processes as a by-product of integrating data or consuming data services.
This saves time and effort in deploying new runtime data capabilities and also ensures that the business meaning of the data is consistently preserved as data flows through the enterprise.
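As a rough sketch of how declarative mappings to a shared model might drive both job generation and lineage, consider the following Python example. Everything in it is assumed for illustration: the Mapping record, the generate_job function, the COPY instruction syntax, and the system, column, and concept names do not come from any real product or ETL engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    system: str   # physical system, e.g. "OMS" (hypothetical)
    column: str   # physical column in that system
    concept: str  # shared-model concept the column is glued to


def generate_job(source_maps, target_maps):
    """Pair source and target columns that share a concept.

    Returns (steps, lineage). Steps are copy instructions for an
    imagined ETL engine; lineage records each hop together with the
    concept that justified it, so it falls out of generation for free.
    """
    targets_by_concept = {}
    for t in target_maps:
        targets_by_concept.setdefault(t.concept, []).append(t)

    steps, lineage = [], []
    for s in source_maps:
        for t in targets_by_concept.get(s.concept, []):
            src = f"{s.system}.{s.column}"
            dst = f"{t.system}.{t.column}"
            steps.append(f"COPY {src} -> {dst}")
            lineage.append((src, s.concept, dst))
    return steps, lineage


source = [
    Mapping("OMS", "ord_amt", "ccbm:Deal.amount"),
    Mapping("OMS", "cust_no", "ccbm:Customer.id"),
]
target = [Mapping("Warehouse", "deal_amount", "ccbm:Deal.amount")]

steps, lineage = generate_job(source, target)
print(steps)    # ['COPY OMS.ord_amt -> Warehouse.deal_amount']
print(lineage)  # [('OMS.ord_amt', 'ccbm:Deal.amount', 'Warehouse.deal_amount')]
```

In this sketch no one writes the source-to-target pairing by hand. It is derived from the two sets of mappings, and the lineage record is produced by the same pass that produces the job.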
Conceptual Models Make Everyone's Job Easier

The CCBM abstracts away the underlying complexity of physical data storage, and since it's directly used for design and development, building things with data is much faster and easier. For example, analysts working on an integration project need only know the CCBM and the source database they need to integrate, not any (or every) target system of the project. And because the CCBM is glued to the data in physical systems, it can be used by analysts to help discover data and its relationship to other data throughout the enterprise for purposes such as impact analysis or redundancy analysis. By leveraging underlying business concepts as the common thread, the CCBM makes sense of the fragmented silos of redundant, often unintelligible data strewn across the enterprise.

Top-down, Bottom-up, and Industry Standard Models

Smart EDM does not require a top-down, one-size-fits-all approach to models and governance. Instead, the agility of the Smart EDM approach comes in part from building out the CCBM via a hybrid of top-down and bottom-up modeling. Aspects of the CCBM can be derived automatically from existing enterprise standards, such as product or customer definitions in MDM implementations. Other parts of the CCBM might come from operational database schemas, SOA interface definitions, or be manually stitched together as needed. In this way, the right trade-offs between the consistency of shared assets and the flexibility of divergent ones can be struck. But in each case, semantic links and relationships between divergent assets can be maintained for lineage tracing, discovery, impact analysis, and integration, as required.

An important component of many CCBMs is industry standards.
Industry standards such as ACORD, FIBO, HL7, CDISC SDTM, and others are increasingly moving beyond limited interchange formats and instead specifying conceptual models and terminologies to be used by various industry players for data exchange and data management. Organizations implementing Smart EDM often use these industry models as a starting point for their CCBM, which can then be extended or modified to accommodate their own unique and evolving data needs.

Smart EDM in Use

With Smart EDM, both IT and business analysts work with data based on a shared model of the concepts that everyone is already using to run the business. This idea transforms nearly any aspect of a large organization's data activities and provides significant time-to-market, process efficiency, and revenue generation ROI. Smart EDM is particularly effective for two ubiquitous use case categories:

- Data Integration. Mapping, transforming, combining, and moving data from one or more sources to another. A Smart EDM solution will work with existing infrastructure (e.g. ETL platforms) by generating the instructions necessary for them to migrate data, while also tracking data lineage and managing requirements along the way.
- Data Services. Operational software that retrieves or ingests data via published APIs. A Smart EDM solution can help automate the implementation, deployment, and hosting of these services. This means that physical data is no longer bound inside service implementation code, avoiding a maintenance nightmare.

Key Smart EDM Use Cases

- Automated generation of high-quality, governed data integration jobs
- Reusable data services for multi-platform apps
- B2B / supply chain partner data exchange
- Efficient data migration for customer, product, or data mart on-boarding
- Integration and governance of Big Data and unstructured data

Efficient and High-quality Data Integration Via Automation and Reuse

A classic Smart EDM data integration use case is the data Extract, Transform, and Load (ETL) function typically used to move data from operational databases to an enterprise data warehouse, data marts, or other stores for reporting and analytics purposes.
A Smart EDM approach allows business analysts to replace point-to-point source-target mappings with mappings from source and target systems to the CCBM.
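To see why this substitution can pay off, a small counting sketch helps. It assumes the worst case, in which every source must feed every target, and the function names are hypothetical.

```python
def point_to_point_mappings(num_sources: int, num_targets: int) -> int:
    """Mappings needed when each source is mapped directly to each target."""
    return num_sources * num_targets


def hub_mappings(num_sources: int, num_targets: int) -> int:
    """Mappings needed when each system is mapped once to the shared model."""
    return num_sources + num_targets


for sources, targets in [(2, 2), (5, 4), (10, 10)]:
    print(
        sources,
        targets,
        point_to_point_mappings(sources, targets),
        hub_mappings(sources, targets),
    )
# 2 2 4 4
# 5 4 20 9
# 10 10 100 20
```

Under that assumption, five sources and four targets need 20 direct mappings but only 9 mappings to the shared model. Adding one more source to a landscape with ten targets adds ten direct mappings, but only one mapping to the shared model.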
Because the mappings are glued to the CCBM, they are reusable, composable, and can directly drive automated generation of executable ETL jobs. Furthermore, by decoupling pairs of source and target systems, analysts can create mappings without the high level of upfront coordination otherwise needed to get all involved data stewards to the table. Finally, the integration mappings are linked through the CCBM to upstream business requirements and data requirements and can yield always-up-to-date data lineage information as a by-product of integrations.

This case is not limited to integrating data in warehouses and marts. The same approach and similar benefits apply in data migration cases. For example, when bringing a new customer, partner, or product on-board, significant amounts of data are collected that must then be properly interpreted, transformed, and provisioned into multiple front- or back-office systems.

Nor is this case limited to traditional integration sources. Smart EDM is an effective paradigm for integrating Big Data and unstructured data by anchoring these unpredictable sources to the flexibility and descriptive capabilities of the CCBM.

Similarly, the CCBM acts as a natural canonical model for harmonizing the meaning of messages exchanged on an enterprise service bus (ESB) in a Service-Oriented Architecture (SOA) environment. That way, any new endpoint connecting to the bus need only integrate with the CCBM, through which it is automatically mapped to any other endpoint already glued to the CCBM. Effectively, the CCBM does for service data semantics what the ESB itself does for service message connectivity.

[Figure labels: Common Model, OMS Endpoint, ESB, Canonical ESB, CRM Endpoint]

Trustworthy and Reusable Data Services

Because the CCBM glues shared business meaning to physical systems and structures, it is ideally situated to drive the creation of data services that provide consistent, understandable, and reusable access to data assets. In this way, Smart EDM brings value to many data services applications, including:

- Data access services for apps and developers. Rather than reinventing bespoke point-to-point connections for every new web app, mobile app, or published API that pulls information from enterprise systems, organizations can build a layer of declaratively defined, easily consumable, and reusable data services that is much easier to create, use, and maintain, and that enforces consistent data use. The CCBM can even generate source code artifacts to help developers consume these data services in their applications.
- Validation services. Missing, incomplete, inconsistent, and inaccurate data costs companies dearly. Organizations deploy elaborate Master Data Management solutions to address a slice of the problem, usually for just customer or product data. But much more than customer data needs to be right to process a sales order. In fact, what counts as valid customer data for a prospect differs from what is valid for a buyer, which differs again from what is valid for a loyalty program member. (See below for more on the relationship between the Smart EDM and traditional MDM approaches.) Building validation services from the essential meaning and rules captured in the CCBM allows for context-sensitive checks that the data at any stage in a business process is complete, correctly formatted, and consistent for that stage.

[Figure: Business Process / Multiple Business Systems. Steps: Quote Order, Submit Order, Approve Order, Fulfill Order, Update Warehouse. Service: Validate Order Service]

- Shared business process/BPMS services. End-to-end business processes, such as an order-to-cash process or trouble-ticket resolution process, usually involve multiple business systems, all of which need to share some data. The usual approach is via point-to-point, step-by-step exchanges between systems involving different sets of data transformations at each step. It's a big integration effort that can easily lead to inaccuracies via lost, misinterpreted, or unnecessarily recreated data. A more sensible approach is to set up a layer of data services that can be shared and reused by all systems in a process. It's easier to guarantee data quality and consistency, and there's a common mechanism for accessing or submitting data, making the whole process easier to maintain. Companies that have chosen to automate their processes with a Business Process Management System (BPMS) should be particularly interested in this approach, because by far the most time-consuming and difficult part of implementing such a BPMS is data integration with existing systems.
- B2B gateways and extended supply chains. To reduce labor costs, shorten fulfillment time, reduce errors and exceptions, and improve customer and partner satisfaction, demand and supply processes are increasingly being automated across company boundaries. Customers and trading partners need uncomplicated ways to exchange data. The problem is that different customers' and partners' systems all produce different data in different data formats. Market leaders can force them all to conform to a canonical data format and interface, but everybody else needs efficient and affordable ways to exchange consistent and well-understood data across a diverse ecosystem of organizations.

Smart EDM and Master Data Management

Master Data Management (MDM) is an approach and set of technologies to maintain one master copy of an important data entity (most commonly customer or product information) to help ensure it is always accurate when needed by anyone or any system in the enterprise. While MDM can improve data quality and consistency, it falls short of tackling the breadth of data challenges addressed by Smart EDM and the CCBM. MDM initiatives are centralized, top-down, and disruptive initiatives that prove to be too
expensive and difficult to aid in standardizing entities beyond the narrow scope of customers or products. Also, MDM's insistence on a single version of the truth limits its utility in situations in which different business functions have a legitimate need for variable, contextual views of shared concepts.

Comparison of MDM and Smart EDM

Data Sharing
  MDM: Everything about a single entity, e.g. customer
  Smart EDM: Any shared entities in a business process or function
Purpose
  MDM: Shared data of record
  Smart EDM: Shared data services
Approach
  MDM: Instance management, match/merge technology
  Smart EDM: Metadata management, semantics, validation rules
Benefits
  MDM: Data accuracy
  Smart EDM: Integration productivity, data consistency
Implementation
  MDM: Top-down, rigid governance
  Smart EDM: Bottom-up, flexible, improves over time
Organization
  MDM: Dedicated, centralized, specialist roles
  Smart EDM: Federated or centralized, IT architect driven

Embracing Smart Enterprise Data Management

The Smart EDM paradigm can be adopted incrementally and rolled out on a project-by-project basis. It offers immediate value in the form of increased data integration and data service productivity via automation, but it also offers accelerating ROI via the network effect as, over time, more and more systems and processes are glued to the CCBM.

The results of Smart EDM are tangible. Companies gain business agility by reducing the time and risk of responding to new business objectives, including faster product release cycles, complex on-boarding of new customers and partners, complying with new regulations and policies, offering new mobile apps and ecommerce experiences, streamlining partner interactions, or enriching Big Data for analytics. Smart EDM encourages new revenue streams both directly (e.g. via licensed data services) and indirectly (e.g. via enhanced customer experience tied to premium services and increased customer loyalty).

Smart EDM also improves the operational efficiency and trustworthiness of decisions made based on data. Service reuse improves the consistency and quality of the data, and the origins of the data are easy to trace. In turn, this enhances user satisfaction and enables and encourages user self-service, improving results and driving down costs.

Furthermore, analysts with no programming skills can often use the CCBM to discover data, trace its origins, see how changes would impact other systems, configure dashboards to analyze metadata or data samples, and use the CCBM itself to precisely and accurately define the data requirements for new integrations and data services, regardless of where and how the physical data is actually stored. Business analysts can also use the CCBM to unambiguously define data integrations and data services that are automatically compiled to executable artifacts, yielding significant savings on development and QA costs.

Smart EDM offers a transformative yet efficient and non-disruptive approach to placing robust, reliable, and meaningful data into the hands of the people who need it when they need it. By leveraging Common Conceptual Business Models made possible by semantic data science, IT organizations can move away from the costly dichotomy that divides data quality and governance initiatives from operational data activities.

The value and benefits of the Smart Enterprise Data Management approach

Smart EDM Element: Common Conceptual Business Model (CCBM) as organizational hub for enterprise data
Value and Benefits:
- Organizes data the way the business thinks
- Data complexity in physical systems is abstracted away
- Easy to understand what data means, how/where used, and how it relates to other data

Smart EDM Element: Directly use CCBM to develop data integrations and data services
Value and Benefits:
- Easy for domain experts to design correctly and unambiguously in one pass
- Usage tracking automatically in sync with deployed systems
- Analysts need know source and CCBM only, not every target

Smart EDM Element: A CCBM based on semantic data science
Value and Benefits:
- Richly expressive
- Easily extended, modified, or relinked at any time without breaking deployed integrations or services

Smart EDM Element: Automated mappings and suggestions using the CCBM
Value and Benefits:
- Faster delivery times and better quality results
- Better reuse of key data assets; endpoint changes cascade automatically to all affected integration models

Smart EDM Element: Automated generation of deployable code and artifacts
Value and Benefits:
- Saves time, eliminates steps, produces better quality results with fewer errors and less rework

Smart EDM Element: Metadata management and versioning for data, models, maps, service designs
Value and Benefits:
- Comprehensively track relationship between data and its uses
- Always up-to-date since models are used directly for development and deployment

Smart EDM Element: Automated data use tracking as a by-product of integration and service development
Value and Benefits:
- No extra time or cost overhead to track data usage
- Extremely useful for lineage tracking and impact analysis
- Metadata and tracking stay in sync with deployed systems

Smart EDM Element: Collaborative governance
Value and Benefits:
- A federated, social, more practical approach to governance

Smart EDM Element: Unstructured and Big Data
Value and Benefits:
- Enriches un/semi-structured data with metadata
- Map any data to CCBM

Smart EDM Element: Incremental adoption
Value and Benefits:
- No disruptive and risky build-out
- ROI accelerates with each project

To Learn More

Contact Cambridge Semantics:
information@cambridgesemantics.com
http://www.cambridgesemantics.com/
About Cambridge Semantics

Cambridge Semantics provides the award-winning Anzo software suite, an open platform for deploying Smart Enterprise Data Management solutions. Enterprises face an increasing need to rapidly discover, understand, combine, and act on data from diverse sources, both within and across organizational boundaries. Anzo makes it easy for both IT and end users to deal with this need by rapidly creating solutions that leverage Common Conceptual Business Models for data integration, migration, on-boarding, governance, and creation of data services.

About Anzo Smart Data Integration

The Anzo Smart Data Integration software suite uses a Common Conceptual Business Model to help customers dramatically increase the speed of completing high-quality, governed data integration and data on-boarding projects. With Anzo, business analysts use an intelligent Excel-based interface to create reusable mappings between source/target systems and a conceptual model. Anzo automatically compiles these mappings into ETL jobs that run on popular 3rd-party ETL engines and also tracks data lineage, business requirements, and other integration project governance details.

To learn more, email information@cambridgesemantics.com.

All rights reserved.