Big Data Analytics
Empowering SMEs to think and act

Author: Haricharan Mylaraiah, Chief Operating Officer
1320 Greenway Drive, Irving, TX 75038

Contents
Executive Summary
Introduction
What Big Data means to our Organization?
How do we handle such complex and continuously emerging technologies?
Privacy, Security and Regulatory Considerations
What are the solution alternatives available to me for Big Data?
Will it cost us a fortune to run behind Big Data?
How do we get access to high-quality resources and teams?
Conclusion

Section 1 Executive Summary

Big Data has been one of the most discussed topics across technology and business forums over the last couple of years. It has gone from the pipe dream of a few enterprise organizations to a reality that most of them have wholeheartedly embraced. Many have already started to see benefits from such implementations, which have helped them pull ahead of competitors that failed to adopt the newer technologies and tools.

This whitepaper addresses the question confronting SMEs: "Is Big Data Analytics relevant for an organization of my size?" Most SMEs never consider these solutions, because they perceive that implementing the infrastructure needed to crunch massive sets of data, and building and managing a team of data scientists, is beyond their reach. Saxon Global is architecting solutions and services that SMEs can adopt with ease.

Saxon Global partners with industry-leading Big Data vendors and technologies such as Hadoop, Cloudera, VoltDB and Splunk to create a unique technology stack that can address any big data business problem. Saxon Global leverages its proprietary analytics platform to build a rich set of business intelligence and insight systems, integrated with enterprise systems (e.g., CRM, ERP), to deliver value for your business.

Section 2 Introduction

Big Data is data gathered from non-traditional sources such as social media, websites, blogs, news feeds, online entertainment channels and e-mails. Such data is unstructured and arrives in huge volumes, but it has the potential to drive deep insights into customer interactions, competitor moves and partner actions. The data thus generated lets you answer questions proactively. It complements and enhances the quality of the information gathered through offline transactional systems. This ability to blend data from multiple sources, multiple formats and multiple locations is what makes big data technologies and tools unique.

This study is aimed at any SME that can use big data analytics to stay in the league of the larger players in its segment. The common challenges these enterprises face while making such a paradigm shift are:
a) What Big Data means to our organization?
b) How do we handle such complex and continuously emerging technologies?
c) Privacy, Security and Regulatory considerations?
d) What are the solution alternatives available to me?
e) Will it cost us a fortune to run behind Big Data?
f) How do we get access to the necessary high-quality resources and teams?

Section 3 What Big Data means to our Organization?

As a business leader or technology leader of your organization, you need to know what big data means to you. Let us demonstrate a few big data use cases for some industries:
a) If you are a successful retailer in the brick-and-mortar world and are trying to build an online presence to beat competitors that are strong e-tailers, you have to start processing the data obtained from your website visits and from the comments that customers and prospects leave on social media sites about your business. You then need to quickly integrate these pieces of information with your POS data or CRM to gain further insight into their purchase behavior and patterns.
b) If you are a financial services organization that is extending its services across business lines such as mortgage, investment, insurance and banking, and large data sets are being generated, you will need the power of big data tools to help with fraud detection and prevention.
c) If you are a stock trading firm that provides value-added services to its customers by offering tips and advice on equity investments, Big Data helps you provide real-time insight into the changing movements of a stock, combined with insights from historical data.
d) If you are an advertising agency that engages a variety of channels to connect with your customers, such as smart phones, television, websites, social media and electronic billboards, big data can help you target your audience effectively and thereby increase the RoA (Return on Advertisements).
e) If your organization provides call center support for any product or service, big data techniques quickly sift through large chunks of data and give the call center executive a unified view of the customer to better service the request.
f) If your organization has a complex IT infrastructure that forms the backbone of your business, big data technologies help you improve your SLA through effective troubleshooting, detection of security breaches and prevention of future occurrences.
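
As a rough illustration of use case (a), the sketch below combines website-visit records with POS purchase records on a shared customer identifier. The field names and sample records are hypothetical, chosen only for this example; real data would come from your own web analytics, social media and POS or CRM systems, and at big data volumes this kind of join would typically run on a distributed platform rather than in memory.

```python
from collections import defaultdict

# Hypothetical sample records; all field names are assumptions for illustration.
web_visits = [
    {"customer_id": "C1", "page": "/shoes", "seconds_on_page": 120},
    {"customer_id": "C2", "page": "/bags", "seconds_on_page": 30},
    {"customer_id": "C1", "page": "/bags", "seconds_on_page": 45},
]
pos_sales = [
    {"customer_id": "C1", "item": "shoes", "amount": 80.0},
    {"customer_id": "C3", "item": "belt", "amount": 25.0},
]


def build_customer_view(visits, sales):
    """Combine online behavior and in-store purchases per customer."""
    view = defaultdict(lambda: {"pages_viewed": [], "total_spend": 0.0})
    for visit in visits:
        view[visit["customer_id"]]["pages_viewed"].append(visit["page"])
    for sale in sales:
        view[sale["customer_id"]]["total_spend"] += sale["amount"]
    return dict(view)


customer_view = build_customer_view(web_visits, pos_sales)
for customer_id, profile in sorted(customer_view.items()):
    print(customer_id, profile)
```

A customer who browses but never buys (C2 here) or who buys without browsing (C3) still appears in the combined view, which is the kind of gap that blending the two sources is meant to expose.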

Section 4 How do we handle such complex and continuously emerging technologies?

Traditional enterprise data warehousing tools have hit a point of maturity and provide diminishing returns on investment as we try to scale them to meet the newer computing needs of analytics. This is a direct result of almost every electronic device becoming an agent of communication, driving exponential data volumes and requiring real-time analysis.

Many popular technologies have emerged over the last decade, some of which started out as quick fixes for very specific business scenarios. Some have been weeded out, while a few others have emerged strongly, mainly due to the push from open source coupled with an efficient ecosystem being established for their growth.

As the market emerges from its infancy, organizations will require well-tested, robust frameworks that are durable and at the same time agile enough to adopt quickly. These frameworks have to be industry standard and need to embody best practices.

Section 5 Privacy, Security and Regulatory Considerations

Big Data has come under attack in the recent past, mainly for the following reasons:
a) Anxiety
b) Past failures of anonymization
c) Concerns regarding how the profits will be shared

All of us have seen the hoax notice messages in our Facebook accounts, which resurface time and again. Such privacy concerns can impact the adoption of big data and call for robust techniques and tools to provide anonymization. (E.g., GLBA excludes aggregate information or blind data from its privacy rules, and HIPAA has well-developed de-identification procedures.)

There is also opposition to big data from a consumer protection point of view, such as:
i) Many suits by groups of individuals who feel that companies shouldn't profit from personal information
ii) Individuals who have sought compensation for the value of their data as a marketing or analytics tool

Additional concerns surround big data, in the form of:
1) What information is to be collected?
2) Whether that information is used to score individuals?
3) The growing push to give information subjects access rights
4) The right to be forgotten: the need to destroy all information pertaining to an individual after a transaction is complete

Any organization using Big Data for enhancing consumerism will have to address the above and be well prepared to respond to any issues. This requires support from strong technical and domain advisers in the selection and implementation of technologies.

Section 6 What are the solution alternatives available to me for Big Data?

Predictions about Big Data for 2013 are as popular a topic as a fortune teller predicting one's personal fortunes as the New Year begins. Numerous innovations are on the anvil that will change the big data landscape and call into question the investments made so far in traditional BI vendors. They also provide solution alternatives for you to weigh and match against your business needs. Some of them are:

Real-time Hadoop: Solutions that allow real-time queries on Big Data, such as BigQuery from Google (a hosted service) and the open-source Impala from Cloudera. Their integrations and extensions will result in a superior experience.

Cloud-Based Big Data Solutions: The likes of Amazon, with its Elastic MapReduce technology, are pushing down the cost of the computing power needed to crunch big data. They will also drive innovative licensing models among BI tool vendors and make such tools accessible to SMEs that in the past couldn't have afforded specialized Business Intelligence tools.

Big Data Appliances: Hardware vendors like HP and Dell are racing to build appliances that can be single-box solutions for managing several MapReduce jobs.

Distributed Machine Learning: Machine learning tools that can work in a distributed environment can bring innovations in job handling and reduce the throughput time for running big data jobs.

Analytical tools: Tools such as Rattle make it convenient to use powerful open source data mining tools such as R. There are also several new players, such as Paradigm4, Tableau and Datameer, that are commercially available for analytics.

Section 7 Will it cost us a fortune to run behind Big Data?

Big Data adoption and growth have been enabled largely by continuous innovations that are driving costs down. There are several costs associated with Big Data:

a) Cost of Networking/Internet: Cost of download falling below $150 per terabyte in a decade's time. (E.g., downloading a movie at current speeds would have cost $270 in 1998; it costs $0.05 now.)

b) Cost of Storing and Computing: Investments in cloud infrastructure such as Amazon EC2 have not only brought your cost of storage close to zero, but models like on-demand provisioning of infrastructure also give you the option to use hardware as and when you need it. (E.g., retrieving and processing images from a collection of a billion is done today at a cost of around $2,000 per billion images, against the few hundred thousand one had to spend a decade ago.) Better processing technologies have always helped reduce the computing infrastructure required to handle large scale.

c) Cost of Resources and Data Scientists: Service organizations are gearing up to meet the need for specialized resources to handle the newer technologies, with many certified professionals available at affordable rates. New roles such as Data Scientist are emerging, filled by people who bring more than one of the skill sets required to handle big data challenges.
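
The examples quoted in this section imply cost reductions that are easy to check. The short sketch below only divides the figures given above; the function and variable names are illustrative, and the "few hundred thousand" figure for image processing is left out because the text does not give an exact number.

```python
def reduction_factor(old_cost, new_cost):
    """Return how many times cheaper new_cost is than old_cost."""
    if new_cost <= 0:
        raise ValueError("new_cost must be positive")
    return old_cost / new_cost


# Figures quoted in this section (US dollars).
movie_download_1998 = 270.00
movie_download_now = 0.05
image_processing_per_billion = 2000.00

movie_factor = reduction_factor(movie_download_1998, movie_download_now)
cost_per_image = image_processing_per_billion / 1_000_000_000

print(f"Movie download is about {movie_factor:,.0f} times cheaper")
print(f"Image processing costs about ${cost_per_image:.6f} per image")
```

On these figures, the movie download is roughly 5,400 times cheaper than in 1998, and processing a single image costs a small fraction of a cent.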

Section 8 How do we get access to high-quality resources and teams?

The meteoric rise of Big Data has made it challenging for organizations to keep pace with the IT trend. It has created a void, as there is a visible divide between the skills of current IT professionals and what it takes to handle big data. According to a study conducted and published by McKinsey in 2011, the U.S. alone could face a shortage of 140,000 to 190,000 people with deep analytical skills and of 1.5 million people who are capable of analyzing data in ways that enable business decisions.

Extending the famous Pareto principle to Big Data Analytics, we can state that 20% of the big data benefit is delivered by the tools and technologies we choose, but 80% depends on the team we put in place to implement and use them. So the 4th V of Big Data, which we call the Value to the business, is actually delivered by this team, which has complete know-how of the business, the technologies and the techniques.

Big Data talent is hard to come by, as these are people whose skills cut across IT, software development, application development and analytics. They require a scientific temperament to perform low-cost experiments that drive superior benefits for the organization. They come in a variety of roles: data scientists, architects, visualizers, change agents and engineers. Your organization may need all of them to become really successful with big data.

Section 9 Conclusion

If you are a Small and Medium Enterprise and you can't afford to bring expensive analytics solutions into your organization, here are a few things that can help you improve your competitive position:
a) Look for cloud-based solutions that offer turnkey business analytics for your vertical or domain.
b) Speak to a Big Data consultant before you make the move.
c) Make an informed choice with the right evaluation of alternatives.
d) Invest in people more than technologies (STOP: Superior Talent, Open Platform).

About Saxon Global

Saxon Global is one of the fastest-growing Inc. 500 companies, providing IT consulting and solution engineering services. Saxon Global has helped organizations across industries such as Financial Services, Retail, Telecom, Healthcare, and Media and Entertainment to successfully adopt IT tools and services for their continued growth. In our constant endeavor to excel, and with our forward-looking approach, we have strengthened our skills and expertise in the emerging areas of Big Data and Business Intelligence.

Saxon Global today partners with industry-leading vendors of Business Intelligence and Big Data technologies to create a compelling technology stack, and delivers its benefits to our esteemed customers through our rich talent pool of data architects and engineers. Customer delight forms the core of the Saxon Global vision, and we are building on three strong pillars: technology competency, a flexible engagement model and reduced total cost of ownership. We engage with our customers at various levels of their data lifecycle and handhold them to success.

Saxon Global is making continuous investments in creating proprietary IP for providing custom solutions in the least time and at the lowest cost. Our frameworks have in the past helped organizations big and small to quickly recognize the power of data and reap huge dividends from deep-dive analytics. We aim to be your trusted Data Analytics and Business Intelligence partner and to unleash the power of your data.