1 THE BIGGER THE BETTER? Big Data in Financial Services
CONTENTS

ABOUT THIS REPORT
GENERAL VIEWPOINT
BIG DATA PILLARS
Pillar 1: Enable the business to make better decisions
Pillar 2: Develop (or mature) the business-led data strategy
Pillar 3: Reduce the cost of the data supply chain
Pillar 4: Enable new disruptive technologies
CONCLUSION: LET BUSINESS DRIVE
ABOUT THIS REPORT

Target Audience

This paper is designed to be relevant to producers and consumers of data in a financial institution. In particular, it is oriented towards the business use of data (e.g., analytics) and technology techniques for managing data. The paper assumes that the reader has some context and understanding around Big Data technologies, how they would or would not apply within a financial organization, and the benefits and challenges that such new technologies present.

Executive Summary

Big Data is a term which has come to spark much interest and discussion. Whether hype or the future, one thing is certain: Big Data has entered the technology industry lexicon in a big way. At Capco, we define Big Data as an information environment that facilitates better business insight and decision-making by leveraging information assets that have high volume, velocity, and variety, and that require new forms of capture, processing, and analysis.

Key to the Capco definition of Big Data is that we perceive it to be a business-driven initiative: it is the business, not technology, which owns the information and data assets of a firm, and so it is the business which should champion the need to leverage Big Data as the proposed solution to specific problems. In many ways, Big Data is not about data, but about analytics, the business insight they provide, and the decision-making they facilitate. Further, while Big Data may have strong application in some areas, it may not be appropriate in all situations. However, this does not detract from the significant potential that Big Data technology offers to rationalize data management costs. Capco's research shows that most financial institutions are implementing Big Data projects over the coming months; however, their objectives are varied and, at times, unclear.
As with all data challenges, any project to solve a specific issue, whether through Big Data technologies or conventional technologies, should be initiated only once a particular challenge has been identified and its complexities are well understood by the business.
GENERAL VIEWPOINT

Capco's research on Big Data has revealed that many financial institutions are currently implementing Big Data solutions over the coming months directed at analytics, cost reduction, or risk management. The research shows that to facilitate better business insight and decision-making, a financial institution must significantly mature its ability to leverage information assets that have high volume, velocity, and variety, and that require new forms of capture, processing, and analysis. The research further substantiates that Big Data should not be incubated by an IT organization's insatiable appetite for the new and cool, but must be driven by the business to enhance revenue, better manage risk, and/or reduce costs: the same imperatives that have traditionally driven data initiatives.

Historically, any one of the three Vs (volume, velocity, variety) created particular challenges within the business and IT ecosystems. Two or more Vs start to tell the story (or horror) of Big Data. Correlation and causation analysis targeted at enabling more effective business decisions becomes exponentially more difficult. The risk of exploding data (and associated costs) and imploding business results and value is real and growing. Big Data in consumer financial services companies is large (billions of transactions), but is only a fraction of the total data set that will be available to organizations in the future. The challenge of elevating data to information, and ultimately to knowledge, is increasingly difficult when underlying data sources inflate in volume, velocity, and variety. Therefore, it becomes ever more important for organizations to get this right now, or risk losing control of their data ecosystem later.

There is no new paradigm for Big Data. Data is THE primary business asset in financial services: understanding how to deal with Big Data simply requires prudence and pragmatism in defining and executing a data strategy.
Our research has shown that financial institutions take four broad approaches to Big Data:
BIG DATA TECHNOLOGY ADOPTION

Conservative Business Adoption / Conservative Technology Adoption:
- Modest, incremental investment driven by specific use cases (marketing, evaluating/tuning customer interactions)
- Filter for end-point analytics and transactional analytics; likely to feed existing BI infrastructure
- Not utilized in direct support of operational processes (e.g., risk)

Conservative Business Adoption / Aggressive Technology Adoption:
- Reduce data footprint and storage cost by replacing existing SAN storage with a Hadoop cluster
- Replacement strategy for ETL
- Replacement strategy for file staging and landing zones
- Replacement strategy for archival and long-term storage

Aggressive Business Adoption / Aggressive Technology Adoption:
- Standardize future-state BI capabilities on Big Data technology across business usages (marketing, risk, operations)
- Replacement strategy for the enterprise data and analytics ecosystem using Big Data (Hadoop, NoSQL) and data virtualization technologies

Based on the insights that we have gathered from our research, we focus on four key pillars for Big Data:

1. Enable the business to make better decisions: Big Data is about analytics to drive business insight
2. Develop (or mature) a business-led data strategy: Big Data is not applicable in all situations
3. Reduce the cost of the data supply chain (1): Big Data technology offers potential for streamlining and driving efficiency
4. Enable new disruptive technologies: Big Data platforms provide a competitive advantage with positive cost impact

This paper will explore each of these pillars and how they can help your organization effectively translate Big Data into business-transforming knowledge.

1. Data supply chain refers to the continuum of data flow from initial capture, to organization and harmonization, through analysis and publishing. This covers the full data lifecycle for both business consumption and technology enablement (sourcing, storage, access, retrieval).
BIG DATA PILLARS

PILLAR 1: ENABLE THE BUSINESS TO MAKE BETTER DECISIONS

Big Data is not only about new sources and greater volume, velocity, and variety of data, but is also fundamentally about the ability to derive new meaning and insights. Big Data serves to complement the traditional approach to data management, e.g., data warehousing and analytics. It can surface new meaning that may not previously have been possible to uncover, and new insight that existed but was obscured by structural barriers to data capture, organization, analysis, and consumption. These barriers could be due to scale, speed, format, modularization, and distributed existence, all of which made it hard to seamlessly link and analyze data.

Historically, data analytics has relied on representative samplings, or subsets of information, to derive business insight. Big Data analytics engines are shifting the focus from examining sample data to examining complete and large data sets. For example, Hadoop is able to speed up processing and analytics by breaking a data set into smaller parts and distributing it across a parallel processing and publishing infrastructure.

Insights from Big Data analytics have been revealing (e.g., Walmart's determination that Pop-Tarts are the second most purchased item after batteries before a hurricane), but also misleading (e.g., Google's prediction of the spread of the flu in the US). The Google flu example highlights the dangers of relying on associative patterns without establishing appropriate relevance. Big Data and Big Analytics are most powerful in hypothesis testing. This proven approach to scientific inquiry is now buttressed by Big Data capability, especially in handling scale and speed.

The greatest potential of Big Data is to make analytics come alive. The creation of new methods and tools to embed information into value systems and business processes (e.g., use cases, workflows, simulations, visualizations, analytic solutions, etc.)
are making insights more understandable and actionable. While trend analysis, forecasting, and standardized reporting are common today, they are likely to be surpassed by data visualization applications (dashboards, scorecards, etc.), simulations/scenarios, dynamic analytics embedded in business processes, and advanced statistical techniques and modeling. This trend is already visible in GPS data being superimposed on real-time traffic patterns to suggest optimal driving routes. In financial services, Big Data has created new capabilities and business models: for example, algorithmic trading, which analyzes massive amounts of market data in fractions of a second to identify opportunities to capture value almost instantly. Further, banks are now capturing and monitoring social media activity to identify at-risk customers as well as opportunities for service offers.
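The split-and-distribute pattern described above, as popularized by Hadoop's MapReduce model, can be sketched in miniature. The following is an illustrative sketch in plain Python standing in for a distributed framework; the transaction records and field names are hypothetical, and on a real cluster each chunk would be processed on a separate node.

```python
from collections import Counter
from functools import reduce

# Hypothetical retail transaction records; in Hadoop these would be
# blocks of a large file spread across the cluster's data nodes.
transactions = [
    {"store": "NY", "item": "batteries"}, {"store": "NY", "item": "pop-tarts"},
    {"store": "TX", "item": "batteries"}, {"store": "TX", "item": "pop-tarts"},
    {"store": "TX", "item": "batteries"},
]

def split(records, n):
    """Partition the data set into n roughly equal chunks (the map inputs)."""
    return [records[i::n] for i in range(n)]

def count_items(chunk):
    """Map phase: each worker counts item frequencies in its own chunk."""
    return Counter(t["item"] for t in chunk)

def merge(a, b):
    """Reduce phase: partial counts from workers are combined into one result."""
    return a + b

partials = [count_items(c) for c in split(transactions, 2)]  # parallel on a cluster
totals = reduce(merge, partials)
top_item = totals.most_common(1)[0][0]  # most purchased item overall
```

Because the map phase touches every record rather than a sample, the result is an exact count over the complete data set, which is precisely the shift from sampling to full-population analytics described above.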
PILLAR 2: DEVELOP (OR MATURE) THE BUSINESS-LED DATA STRATEGY

Business ownership of data is generally accepted by technologists everywhere and rarely acknowledged by business users anywhere. As with requirements definition and sign-off, the business needs to accept this responsibility because of the importance of segregating duties: technologists are responsible for building the machinery of applications, and users are responsible for ensuring that content is accurate. However, business users understand and consume data in the context of their business processes and applications, not in abstracts like governance and quality. As data flows through an organization, it frequently changes state and thus meaning. These concepts and controversies aren't new; we've been dealing with these issues in the traditional data space for years. Big Data magnifies the complexity by cranking up the velocity, volume, and variety to new levels. It is our view that, even more than traditional data, Big Data must be managed and consumed at the application level. Therefore, it is imperative that data be treated like a financial asset and consistently assigned a fair value.

Ownership through Segregation of Duties

Ask a technologist "who owns the data?" and you'll get the standard answer: the business owns the data. Ask a user what it means to own the data and you'll get a blank stare. What does owning the data mean? And who is this mysterious business person? Are they in management or finance or operations? Now let's have some real fun: who owns the Big Data? Is it possible to have a bigger blank stare?

Data ownership is about ensuring quality. Technologists build the platforms for establishing and reporting on quality metrics, but should not be expected to validate the content of those metrics. Often, however, technologists engage in validation as they understand the importance of getting it right.
This creates a material problem from a segregation-of-duties perspective: the team who builds the scoreboard needs to be independent from the game. We've seen many a technology division define and report on quality metrics that were incorrect, which fundamentally undermines the program and the credibility of the players.

Big Data introduces a radical volume shift when diagnosing problems for both business and technologists. Consider the additional set of business rules required if a regulatory request is presented to trace data for all feeds for all FX trades for the last seven years because of the LIBOR scandal. Big Data technology offers many techniques for storage management and distributed processing, but the patterns needed to sift through the huge volumes of data require new thinking. The solution requires a close partnership between business and technology: business users will provide the detection logic, and technologists will implement it.
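As an illustration of that division of labor, business-supplied detection logic for the regulatory scenario above might look like the following sketch. This is plain Python over hypothetical trade records and field names; in practice technologists would push the same predicates into a distributed engine such as Hadoop to scan the full seven-year history across all feeds.

```python
from datetime import date, timedelta

# Hypothetical trade records consolidated from multiple feeds.
trades = [
    {"feed": "feedA", "asset_class": "FX", "trade_date": date(2012, 5, 1), "id": "T1"},
    {"feed": "feedB", "asset_class": "FX", "trade_date": date(2004, 3, 9), "id": "T2"},
    {"feed": "feedA", "asset_class": "EQ", "trade_date": date(2013, 1, 15), "id": "T3"},
]

def within_lookback(trade, as_of, years=7):
    """Business rule: the trade falls inside the regulatory lookback window."""
    return trade["trade_date"] >= as_of - timedelta(days=365 * years)

def is_in_scope(trade, as_of):
    """Detection logic defined by the business: all FX trades, all feeds, last 7 years."""
    return trade["asset_class"] == "FX" and within_lookback(trade, as_of)

as_of = date(2013, 6, 30)
in_scope = [t["id"] for t in trades if is_in_scope(t, as_of)]
```

The point of the sketch is the separation itself: the predicates encode what the business means by "in scope," while the surrounding machinery that applies them at scale belongs to technology.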
The Importance of Context

Business context is a critical component for managing both Big and traditional data. As data moves through its lifecycle, events occur that trigger changes in state, meaning, and usage. For example, take the scenario where a global multinational retail bank is in the middle of implementing a new account simplification marketing campaign. The project goal is to reduce the number of account types to a three-tiered model, and marketing wants to understand client sentiment by interpreting client interactions across multiple websites and social media sites, and by classifying clients through demographic data previously collected during the onboarding process. In Big Data terms this represents a dual challenge: volume exacerbated by the variety of data types involved. Technology can deliver an integrated social media and website warehouse, but without understanding the marketing goals, it is just more expensive content. How many successfully delivered technology projects failed to meet business-defined goals? Aligning the partnership between business and technology to business goals through application context is a crucial ingredient of success.

PILLAR 3: REDUCE THE COST OF THE DATA SUPPLY CHAIN

Given the importance of data within financial services firms, one could argue that thinking about data as a supply chain from input through process to output comes naturally to market participants. Take equities trading as an example: the flow from order to execution to trade to allocation to ledger posting is generally well understood. However, whenever this supply chain breaks, through duplication caused by legacy system errors and/or merger & acquisition activity, or similarly suffers from a lack of master data integration, the cost footprint is high. In addition to economic pressure, the regulatory demands on the data supply chain increasingly move from an output-oriented view to a process-oriented view.
In particular, BCBS 239 forces globally significant institutions to formalize and improve governance and infrastructure, data aggregation capabilities (accuracy & integrity, completeness, timeliness, adaptability), and reporting practices. None of this is terribly complicated on the surface, but it requires that institutions master the fundamentals: understand your data, control how it moves through the organization, and leverage golden-source data stores to drive most reporting needs. Without these fundamentals in place, the application of inexpensive Big Data technologies risks leading to a proliferation of unsynchronized data marts with doubtful data quality; being approximately right often is not enough.

Business architecture leads enterprise architecture, which in turn leads data architecture. The ability to develop an effective and efficient implementation of these, as well as their use in defining and driving a future-state roadmap, will allow organizations to drive long-term savings. The choice of technologies to make the data supply chain available in a federated fashion or through controlled Big Data repositories becomes a secondary question, the answer to which will change as technologies evolve.
PILLAR 4: ENABLE NEW DISRUPTIVE TECHNOLOGIES

Disruptive technologies provide transformative capabilities at the platform level. They have the potential to incrementally or radically change how the business operates. For example, when relational databases advanced from hundreds to thousands of transactions per second, we saw the emergence of high-frequency algorithmic trading systems. These new technology capabilities enabled an entirely different business, which had not previously existed. Similarly, Big Data technology has the potential to disrupt existing businesses as well as enable newly formulated ones. Hadoop-based technology is a compelling alternative to conventional data services infrastructure. For many organizations, Big Data technology is disrupting incumbent investments in data aggregation, data transfer, ETL, filtering, and archival platforms.

Our view is that the pace of adoption of Big Data technology is based primarily on the adequacy and sophistication of the bank's existing information management infrastructure. If the current data aggregation and sourcing implementation satisfies business needs, the introduction of Hadoop-based technology is modest and incremental. If the financial institution's existing data infrastructure is inadequate for cost or completeness reasons, we observe these organizations make more aggressive investments in Big Data technology as a full replacement for existing infrastructure (illustrated in Diagram 1).

Diagram 1: Big Data as a data hub. Traditional structured data flows through ETL (mostly batch) into the data warehouse and data marts, which feed business intelligence and analytics tools serving ongoing operations and business management functions. Additional high-volume structured data and unstructured data flow into a data hub (near real-time and batch), a new Big Data component that supports marketing, customer insights & experience, and fraud analytics, alongside operations management, security, and workflow.
If the current data management implementation is not optimal, Big Data can be introduced incrementally to coexist with current data warehouse and analytics solutions in two main ways:

As a data hub to feed data into the data warehouse (Diagram 1)
As a complementary data sourcing and storage platform alongside the existing data management infrastructure (Diagram 2)

Diagram 2: Big Data as a complementary sourcing and storage platform. As in Diagram 1, traditional structured data flows through ETL (mostly batch) into the data warehouse and data marts, feeding business intelligence and analytics tools. Additional high-volume structured data and unstructured data land in a distributed file system, a new Big Data component accessed through data retrieval & query services, supporting marketing, customer insights & experience, and fraud analytics, alongside operations management, security, and workflow.

In both cases, the traditional information management/BI platform (the green components in Diagram 1 and Diagram 2) still serves ongoing operations and business management functions. Only marketing, customer insights and experience, and fraud analytics sit directly on top of the Hadoop infrastructure (the blue components in Diagram 1 and Diagram 2).
CONCLUSION: LET BUSINESS DRIVE

Technology and business executives from all industries are drawn to Big Data, and many are now investigating how Big Data can be leveraged within their firms. What these executives often realize is that successful implementation of Big Data requires enhancements and expansions of their existing business structures, including processes, technologies, controls, and organization, to address the new forms of data capture, processing, and analysis brought about by Big Data.

Nonetheless, while the primary thrust of Big Data is to facilitate better analytics and business insight, it may not be the best solution for all data initiatives. Capco's research has shown that Big Data is best used for marketing, enhancing customer experiences, and operations efficiency. However, conventional/traditional approaches to data remain valid for ongoing business management.

At Capco, we believe that prior to undertaking any Big Data discussions, firms must first ensure their IT organizations understand their role as a participant, rather than as the leader. IT must allow the business to drive: it is the business, as owner of the firm's data assets, which will play the critical role of defining the firm's data strategy. However, IT can and should take the lead in capturing the potential for Big Data technology to reduce data lifecycle cost. Thus, role and responsibility segregation must be clear: business sets strategy, and technology helps realize it.

Lastly, business must also realize that the Big Data marketplace is relatively new, and so the landscape is ever-changing. To truly harness and realize value from Big Data, an organization must first have a robust roadmap, with dependencies and accelerators identified, so that it can be both agile and adaptive in responding to continuously changing needs and market conditions.
Capco is currently working on a Big Data initiative which blends traditional data approaches with newly emerging techniques and frameworks to provide business value to our clients. Stay tuned for more insights, guidelines, and best practices specific to a range of services within financial services, from Retail Banking and Capital Markets to Wealth and Investment Management.
Paul Ringmacher is a Partner with Capco's Technology group in North America. He focuses on complex technology solution development, system integration, software application development, and enterprise infrastructure design and optimization. Prior to joining Capco, Paul was a Managing Director at BearingPoint focusing solely on the financial services industry, most notably in the Banking vertical.

Sandeep Vishnu is a Partner in Capco's North American Finance, Risk, and Compliance practice. He focuses on enterprise risk management, financial analysis, data, risk analytics/modeling, business intelligence/reporting, compliance, capital, and operational/control issues. Sandeep has over 20 years' experience serving in principal and management roles in strategy and technology consulting. Before joining Capco, he was a Managing Director in the Global Risk, Finance and Compliance practice of a global management and technology firm.

Tom Castriota is a data guy. He specializes in Data Warehousing, Business Intelligence and Analytics, Master Data Management, and Data Strategy, with a career that spans over 25 years. Tom's background blends Financial Services experience with Commercial Software Development practices and Advisory and Management Consulting services. Tom has served as advisor to Chief Data Officers and Data Executives for several global institutions.

Tamar Tepper is a Senior Consultant specializing in Banking, Wealth & Investment Management in Capco's Toronto office. Since joining Capco in January 2011, Tamar has worked on several strategy projects, including a three-year Operations and Technology transformation initiative, and two large technology design, development, and implementation projects in client onboarding and credit card conversion. Tamar is currently working as a business analyst on a project institutionalizing an enterprise-wide Hadoop (Big Data) platform at a major bank.
Tamar holds a Master of Business Administration from the Richard Ivey School of Business.

ABOUT CAPCO

Capco is a global business and technology consultancy dedicated solely to the financial services industry. We work in this sector only. We recognize and understand the opportunities and challenges our clients face. We apply focus, insight, and determination to consulting, technology, and transformation. We overcome complexity. We remove obstacles. We help our clients realize their potential for increasing success. The value we create, the insights we contribute, and the skills of our people mean we are more than consultants. We are a true participant in the industry. Together with our clients, we are forming the future of finance. We serve our clients from offices in leading financial centers across North America, Europe, Africa, and Asia.

WORLDWIDE OFFICES

Amsterdam, Antwerp, Bangalore, Bratislava, Charlotte, Chicago, Düsseldorf, Frankfurt, Geneva, Hong Kong, Johannesburg, London, New York, Orlando, Paris, San Francisco, Singapore, Toronto, Washington DC, Zürich

To learn more, contact us in the UK on , in Continental Europe on , in North America on , or visit our website at CAPCO.COM