Big Data Success Step 1: Get the Technology Right
TOM MATIJEVIC, Director, Business Development
ANDY MCNALIS, Director, Data Management & Integration
MetaScale is a subsidiary of Sears Holdings Corporation
Speaker Overview
TOM MATIJEVIC, Director, Business Development
- Responsible for business development, client acquisition, and long-term customer profitability
- Collaborates with organizations across diverse industries to help deliver value from big data, giving business users throughout the enterprise access to more data faster than ever
- Over 25 years of experience, with a diverse background that includes data center infrastructure, entrepreneurship, marketing, and business development
ANDY MCNALIS, Director, Data Management & Integration
- Leading member of the team that builds, deploys, and manages an enterprise-scale Hadoop platform at Sears
- Member of the Sears / MetaScale Big Data Center of Excellence
- Involved in the development of design best practices for Hadoop and production-ready big data environments
- Spent over seven years as a Data Warehouse Manager at Sears, managing Teradata, Netezza, Greenplum, and other data warehouse environments
MetaScale: A Company of Sears Holdings
MetaScale grew out of a Fortune 100 enterprise that leverages the Hadoop ecosystem to manage several petabytes of data and deliver new business intelligence capabilities.
We manage reliable production environments for our parent company and other enterprise customers:
- Over 3 petabytes of effective storage
- Over 500 nodes in multiple data centers
- Hadoop, Storm, Kafka, and Cassandra in production
Our Big Data Center of Excellence comprises 200+ practitioners with deep practical experience in designing, developing, and integrating production Hadoop and NoSQL solutions.
Tap into the best enterprise Hadoop talent available in the marketplace!
Over a Century of Innovation
- A Fortune 100 company with nearly $40 billion in annual revenue
- The nation's fourth-largest broadline retailer, with almost 2,500 full-line and specialty retail stores in the US and Canada
- A front runner in big data efforts, including driving personalized marketing and generating savings from legacy migration
- Runs one of the biggest rewards programs, which captures and quickly analyzes large volumes of customer transactions
Sears: A Technology Perspective
- Online
- Mobile
- In-home
- Membership
- In-store
Everything is Great... What Happens at the Backend?
- Hundreds of nodes of Hadoop clusters
- Massive virtual private cloud computing
- Highly optimized shared databases
- Over 5 petabytes of data
- Billions of transactions and price changes
- NoSQL databases with Hadoop
The Classic Enterprise Challenge
Constant pressure to lower costs, deliver faster, migrate to faster processing cycles or real time, and answer more difficult questions
The Challenge:
- Tight IT budgets
- Growing data volumes
- Shortened processing windows
- Latency in data
- Escalating costs
- ETL complexity
- Demanding business requirements
- Hitting scalability ceilings
It All Started With a Pricing Problem
The Challenge:
- Intensive computational and large storage requirements
- Needed to calculate item price elasticity based on 8 billion rows of sales data
- The calculation could only be run quarterly, and only on a subset of the data
- The business needed it more often, to react to market conditions and new product launches
- 1.4B SKUs across 3,400 sites
Technology Stack:
- Legacy systems: mainframe
- Teradata / Exadata
- All others
The Solution:
- Implemented a Hadoop and Cassandra based infrastructure
- Moved price elasticity calculations from quarterly to weekly
- Increased data set volumes and granularity
- Met business requirement SLAs
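The slide does not say how price elasticity was computed. As a rough illustration only, the sketch below assumes a constant-elasticity (log-log) demand model and estimates the elasticity for a single item as the least-squares slope of ln(quantity) against ln(price). The function name `estimate_elasticity` and the sample data are hypothetical, and a production job would run a calculation like this per item across the full sales history rather than on an in-memory list.

```python
import math


def estimate_elasticity(prices, quantities):
    """Least-squares slope of ln(quantity) on ln(price).

    Assumes a constant-elasticity demand model (an assumption of this
    sketch, not something stated in the slides).
    """
    if len(prices) != len(quantities) or len(prices) < 2:
        raise ValueError("need at least two (price, quantity) pairs")
    log_p = [math.log(p) for p in prices]
    log_q = [math.log(q) for q in quantities]
    mean_p = sum(log_p) / len(log_p)
    mean_q = sum(log_q) / len(log_q)
    cov = sum((x - mean_p) * (y - mean_q) for x, y in zip(log_p, log_q))
    var = sum((x - mean_p) ** 2 for x in log_p)
    if var == 0:
        raise ValueError("prices must vary to estimate elasticity")
    return cov / var


# Hypothetical weekly observations for one item, generated from
# quantity = 1000 * price^-2, so the true elasticity is -2.
prices = [8.0, 9.0, 10.0, 11.0, 12.0]
quantities = [1000 * p ** -2.0 for p in prices]
elasticity = estimate_elasticity(prices, quantities)
```

An elasticity below -1 means demand is price-elastic: a 1% price increase reduces unit sales by more than 1%.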
Data Warehousing Tools and Products
DATA WAREHOUSE
- Traditional
- Proprietary
- High costs: licensing, support, etc.
Teradata
- Market leader
- Solution includes software and hardware
- Works as advertised
- Costly to scale
Greenplum (EMC/Pivotal)
- Software-only option (but they do have an appliance)
- Offered MapReduce
- Columnar projects
Netezza (IBM)
Big Data Tools and Products
BIG DATA
- Open source
- Many contributors
- Less expensive to deploy than traditional data warehouse vendors
Hadoop
- HDFS: Hadoop Distributed File System
- Map/Reduce processing framework
- Batch centric
- Becoming more database-like
NoSQL
- HBase, Cassandra, MongoDB
- Real-time analytics
- Scales easily and inexpensively
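To make the Map/Reduce processing model concrete, here is a minimal in-memory sketch of its three stages: map, shuffle (group by key), and reduce. It is plain Python and does not use the Hadoop API; the function names and the sample sales records are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Emit (key, value) pairs for one input record: (sku, units sold)."""
    sku, units = record
    yield sku, units


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sales records: (sku, units sold)
sales = [("sku-1", 3), ("sku-2", 5), ("sku-1", 4)]
totals = run_job(sales)
```

On a real cluster the map and reduce calls run in parallel on many nodes and the shuffle moves data across the network, which is why the model suits large batch workloads.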
Right Tool for the Right Job
Enterprise Data Hub
Modernizing Legacy Systems
Mainframe MIPS Optimization
- A mainframe batch business process would not scale: it needed to process 100 times more detail to handle the rollout of high-value, business-critical functionality.
- We migrated sections of the batch process from the mainframe to Hadoop and back, eliminating the associated MIPS consumption and improving overall cycle time without disruption to business users.
Eliminate ETL Bottlenecks
- ELT X: Extract, Load, Transform, Transform, Transform
- ETL processing with the DataStage platform was taking over ten hours to complete.
- With the Enterprise Data Hub model on Hadoop, we are able to use Pig to complete the transformations in less than an hour.
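The extract-load-transform pattern described above can be sketched as follows: raw records are loaded unchanged first, and each transformation is then applied as a separate step over the loaded data. This is a plain Python illustration, not the Pig scripts used in the project; the record layout, function names, and sample data are all hypothetical.

```python
def load_raw(lines):
    """Load step: keep every raw line untouched, as in a landing zone."""
    return list(lines)


def parse(rows):
    """Transform 1: split each raw line into typed fields."""
    for line in rows:
        store, sku, amount = line.split(",")
        yield {"store": store, "sku": sku, "amount": float(amount)}


def drop_returns(rows):
    """Transform 2: keep only positive amounts (loosely like a Pig FILTER)."""
    return (r for r in rows if r["amount"] > 0)


def total_by_store(rows):
    """Transform 3: sum amounts per store (loosely like GROUP plus SUM)."""
    totals = {}
    for r in rows:
        totals[r["store"]] = totals.get(r["store"], 0.0) + r["amount"]
    return totals


# Hypothetical raw feed: store,sku,amount
raw = load_raw(["s1,a,10.0", "s1,b,-2.5", "s2,a,4.0", "s1,c,1.5"])
store_totals = total_by_store(drop_returns(parse(raw)))
```

Because the raw data stays in the hub, a new transformation can be added or rerun later without extracting from the source system again.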
Browsing History: Hadoop and HBase
Customer Analytics with Social Media Data
Brand Perception and Sentiment Analysis
[Figure: Brand Perception Comparison for Brand A, Brand B, and Brand C, including mixed sentiment]
Analyze conversation sentiment about multiple brands
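The slide does not describe the sentiment method used. As a toy illustration of per-brand sentiment comparison, the sketch below assumes a simple keyword lexicon and tallies each post as positive, negative, mixed, or neutral. The word lists, function names, and sample posts are hypothetical, and a real system would use a far richer model.

```python
from collections import Counter

# Hypothetical keyword lexicons, for illustration only.
POSITIVE = {"love", "great", "reliable"}
NEGATIVE = {"hate", "broken", "slow"}


def classify(text):
    """Label a post by which lexicons its words hit."""
    words = set(text.lower().split())
    has_pos = bool(words & POSITIVE)
    has_neg = bool(words & NEGATIVE)
    if has_pos and has_neg:
        return "mixed"
    if has_pos:
        return "positive"
    if has_neg:
        return "negative"
    return "neutral"


def brand_sentiment(posts):
    """Count sentiment labels per brand from (brand, text) pairs."""
    summary = {}
    for brand, text in posts:
        summary.setdefault(brand, Counter())[classify(text)] += 1
    return summary


posts = [
    ("Brand A", "love the new washer"),
    ("Brand A", "great price but slow delivery"),
    ("Brand B", "my dryer arrived broken"),
    ("Brand C", "picked one up yesterday"),
]
summary = brand_sentiment(posts)
```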
Key Takeaways
- An Enterprise Data Hub provides a single version of truth for all data
- Hadoop can help you answer questions that were difficult or cost prohibitive to answer before
- Hadoop can transform your organization's approach to how you use data and let you ask questions you never even thought of
- You must have a clear strategy and long-term plan
- Leverage the right partnerships to achieve your goals
How We Help Achieve Long-Term Success
Start Small >> Define Success >> Strategic Partnership >>
MetaScale takes a holistic approach to advising organizations in the development of use cases with clearly defined success criteria.
We design and build relevant business cases, custom big data programs, and long-term strategy to meet current and future analytics requirements.
Our proven methodologies, best practices, patent-pending tools, and experienced resources make us your best strategic partner for accelerating your big data initiatives.
Your One-Stop Big Data Helpline
For further information
phone: 1-800-234-8769
email: contact@metascale.com
visit: www.metascale.com