1 BIG DATA: Opportunities Ahead H. R. Mohan, President, Computer Society of India
2 What is Big Data? Big data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using on-hand data management tools or traditional data processing applications. It poses challenges in capture, curation, storage, search, sharing, transfer, analysis and visualization. In a 2001 research report, META Group (now Gartner) analyst Doug Laney defined data growth challenges and opportunities as being three-dimensional, i.e. increasing volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources). Gartner later updated its definition as: "Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."
3 Big Data Hype Cycle
4 Big Data Hype Cycle 2014
5 Big Data Potential
7 Venture Capital Funding in BiDA
10 World's largest ever Big Data repository of electoral data Hyderabad-based Modak Analytics, a data analytics startup, built India's first Big Data-based electoral data repository of 81.4 crore voters for the recent elections held in Apr-May. It had to analyze over 9 lakh PDFs amounting to over 2.5 crore pages in 12 languages. This data was mapped to 9.3 lakh polling booths across 543 parliamentary and 4,120 assembly constituencies. Source:
11 Big Data in Context Google Flu Trends is a web service operated by Google. It provides estimates of influenza activity for more than 25 countries. By aggregating Google search queries, it attempts to make accurate predictions about flu activity. This project was first launched in 2008 by Google.org to help predict outbreaks of flu. Nine days before the World Health Organization announced the African Ebola outbreak now making headlines, an algorithm had already spotted it. HealthMap, a data-driven mapping tool developed out of Boston Children's Hospital, detected a "mystery hemorrhagic fever" after mining thousands of web-based data sources for clues.
12 IBM Big Data Baby For the Neonatal Intensive Care Unit at the Hospital for Sick Children (Sick Kids) in Toronto, Big Data tools have proven valuable in allowing doctors to monitor the vital signs of premature infants around the clock. Watch
13 Big Data for News Feed A mobile phone is filled with features that, well directed, can significantly improve user experience and provide reams of usage data. Imagine a news feed natively produced in different formats: long, short, capsules of text, with stills and videos in different sizes and lengths. Every five minutes or so, the feed is updated. After a while, your smartphone has recorded your usage patterns in great detail using Google Now & Google Location History. It knows when you read the news and, more importantly, under what conditions. By poring over such data, analytics specialists can understand what is read, watched and heard, at what time of the day and in which environment. Do users favor snippets when commuting? What's the maximum word-length for a story to be read in the subway without being dropped, and what length is more likely to induce future reading? What's the optimal duration for a video? What kind of news package fits the needs and attention of someone on the move? And what sort of move: car or Tube? (Accelerometers and motion sensors can tell that.) It will help to decide whether it's better to serve the smartphone owner a clever podcast while he is likely to be stuck in the car for the next 50 minutes on the highway (as revealed by the GPS patterns of the last few months), or to favor text and preloaded videos for Tube commuters.
14 Big Data for Online Advertising Demographic Info: Name/ID, Age, Gender, Location. Online behaviour: Click History, Ads Viewed (Impressions), Ads Clicked (Conversions). Third Party Data: Market segment to which the user belongs, Tweets, Likes. Challenges: Veracity, Dynamic, Low positive response, Large number of segments/attributes, Impatient Advertisers/Marketers.
15 Some Big Data Implementations By analyzing the huge volume of data produced every day on social media, Walmart is trying to shape the future of retail. Monsanto has looked to big data in agriculture to increase yields while saving on seed, chemicals and fertilizers; as we near an era with more people and fewer resources, this makes farming one of the most important careers in the world. The Michael J. Fox Foundation taps Intel's big data and wearables for Parkinson's research. GE announced the first data lake approach for the Industrial Internet to better access, analyze and store industrial-strength big data, with a 2000x performance improvement in analysis time, thereby improving supply chains and customer service operations. GE analyzes flight data for its customers. In a 2013 pilot, GE Aviation collected information on 15,000 flights from 25 different airlines at about 14 gigabytes of metrics per flight and, by using the data lake approach, achieved a 10x cost saving and reduced analysis time from months to days. GE expects the data collection to grow to 10 million flights and 1,500 TB of flight data by next year.
16 Some Big Data Implementations The cloud-based app, powered by SAP's flagship HANA in-memory database, was designed for scouts, executives, coaches, and trainers to file player evaluations and line these up with real-time quantitative data, streamlining the comparison of prospective players. IBM added another set of cloud-based analytics software products to its portfolio, this time focusing on workforce, talent, and human capital management challenges. Auto insurer Progressive collected 10 billion miles of driving data (time and speed) from its customers and used it not only to lower insurance rates but also to determine which roads are the most problematic for drivers and in need of work or repair. The U.S. Department of Labor predicts the standard of living could fall 9% by 2030, to the levels of 2000, unless states deploy real-time analytics and actionable insights on labor supply and demand.
17 Some Big Data Implementations HR heads of Indian IT companies are using analytics to accurately predict career aspirations of employees, identify high performers and even predict when an employee is likely to leave. San Diego Metropolitan Transit System studies bus and trolley ridership patterns using Urban Insights' predictive analysis platform. WIFIRE, created by the University of California at San Diego and the University of Maryland, crunches satellite data and real-time remote sensor data with various computational techniques to forecast the rate at which wildfires might spread. Ford Motor, with the help of big data and climate change models, has redesigned its 2015 model of F-150 heavy duty trucks to be 700 lbs lighter using aluminum alloy components similar to military and aerospace materials. It aims at improvements in fuel economy & vehicle emissions (it used 25 to 260 GB of data per hour). Earlier, it had optimised its production & inventory with big data analytics.
18 Big Data and Indian IT Industry A joint study by NASSCOM and CRISIL Global Research & Analytics suggests that by 2015, Big Data is expected to become a USD 25 billion industry. The report estimates that the Indian Big Data industry will grow from USD 200 million in 2012 to USD 1 billion in 2015 at a CAGR in excess of 83%. There are opportunities in Big Data implementation and analytics outsourcing services, such as Big Data technology implementation, including data collection, integration and designing of Big Data architecture.
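The compound annual growth rate (CAGR) quoted above can be checked with the standard formula, CAGR = (end / start)^(1 / periods) - 1. The sketch below is illustrative only: the function name `cagr` is hypothetical, and counting 2012 to 2015 as three yearly periods is an assumption about how the report measures the interval.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


if __name__ == "__main__":
    # Endpoints as quoted on the slide: USD 200 million (2012) to USD 1 billion (2015).
    # Assumption: three yearly compounding periods.
    rate = cagr(200e6, 1e9, 3)
    print(f"Implied CAGR: {rate:.1%}")
```

With the endpoints exactly as quoted and three periods, the formula gives roughly 71% per year, so the report's "in excess of 83%" presumably rests on a different base year or on figures not shown on the slide.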
19 Evolution of BiDA Basic Reporting MIS Decision Support Systems / Expert Systems Business Intelligence Analytics Predictive Analytics: Beneficial to most industry verticals, such as telecom (customer churn), banking (fraud, customer engagement), retail (forecasting & sentiment analysis), healthcare (detection of life-threatening conditions) and government (city traffic flow & commuting options).
20 Big Data 3Vs
24 Beyond 3 Vs of Big Data Veracity refers to the trustworthiness, biases, noise and abnormality in data. Validity asks whether the data is correct and accurate for the intended use. Volatility refers to how long data remains valid and how long it should be stored. Value is concerned with the business insights that the data provides.
26 Big Data Analytics Infrastructure
27 Hadoop Software & Hardware Hadoop software (open source) is designed to orchestrate massively parallel processing on relatively low-cost servers that pack plenty of storage close to the processing power. All the power, reliability, redundancy, and fault tolerance are built into the software, which distributes the data and processing across tens, hundreds, or even thousands of "nodes" in a clustered server configuration. These nodes are "industry standard" x86 2U servers that cost USD 2,500 to USD 15,000 each, depending on CPU, RAM and disk choices. They typically have CPUs with a total of 12 processors and are fitted with 64 GB to 128 GB of RAM. DataNodes usually have a dozen 2 TB or 3 TB 3.5-inch hard drives in a JBOD (just a bunch of disks) configuration. Some suppliers: Cisco Unified Computing System, Dell, HP DL360P Server, IBM & Lenovo, Supermicro, Oracle Big Data Appliance, Pivotal Data Computing Appliance, Teradata Appliance for Hadoop, Cray, SGI Infinite Data Cluster, SeaMicro.
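The DataNode figures above lend themselves to a rough storage estimate. The following is a minimal sketch, not a sizing tool: it assumes HDFS's default replication factor of 3 (not stated on the slide), ignores space needed for the operating system and intermediate job data, and the function names are hypothetical.

```python
def raw_capacity_tb(nodes, disks_per_node=12, disk_tb=2.0):
    """Raw disk capacity of the cluster in TB (JBOD, so disk sizes simply add up)."""
    return nodes * disks_per_node * disk_tb


def usable_capacity_tb(nodes, disks_per_node=12, disk_tb=2.0, replication=3):
    """Approximate room for unique data when every block is stored `replication` times.

    Assumption: replication=3 mirrors the HDFS default; real clusters also
    reserve space for temporary and non-HDFS data, which is ignored here.
    """
    return raw_capacity_tb(nodes, disks_per_node, disk_tb) / replication


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "nodes:", raw_capacity_tb(n), "TB raw,",
              usable_capacity_tb(n), "TB for unique data")
```

Under these assumptions, a 100-node cluster with a dozen 2 TB drives per node has 2,400 TB of raw disk and about 800 TB available for unique data.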
28 Components of Analytics (SAP) Core: A foundation for managing BI initiatives (data exploration, visualization, organization, reporting, sharing and more) is core for organizations of all sizes to understand how they are performing. Creative: Easy-to-use, self-service connectivity to different environments and data sources helps customers unlock deeper business insights, enabling user autonomy and individual creativity. Mobile: Delivery of the content required by workers in the contexts they need it, wherever they are, on any device, via a friendly, consistent user experience means instant answers for more people. Extreme: Powerful capabilities such as real-time and predictive analytics enable organizations to tackle "big data," uncover hidden risks and reveal untapped opportunities. Social: Increased collaboration allows teams to drive and capture the decision-making processes surrounding structured and unstructured data.
30 Big Data for Automotive Industry: Data warehouse optimization; Predictive asset optimization; Connected vehicle; Actionable customer insight
31 Big Data for Banking: Optimize offers and cross sell; Contact center efficiency and problem resolution; Payment fraud detection and investigation; Counterparty credit risk management
32 Big Data for Consumer Products: Optimized promotions effectiveness; Micro-market campaign management; Real-time demand forecast
33 Big Data for Energy and Utilities: Distribution load forecasting and scheduling; Create targeted customer offerings; Condition-based maintenance; Enable customer energy management; Smart meter analytics
34 Big Data for Healthcare: Measure and act on population health; Engage consumers in their healthcare; Health monitoring and intervention
35 Big Data for Insurance: Claims fraud detection; Next best action and customer retention; Catastrophe risk modeling; Usage-based insurance; Portfolio management; Producer optimization
36 Big Data for Oil & Gas: Advanced condition monitoring; Drilling surveillance & optimization; Production surveillance & optimization
37 Big Data for Retail: Merchandise optimization; Actionable customer insight
38 Big Data for Government: Threat prediction and prevention; Social program fraud, waste and errors; Tax compliance (fraud and abuse); Crime prediction and prevention
39 Big Data for Telecom: Pro-active call center; Smarter campaigns; Network analytics; Location-based services
40 Big Data for Travel & Transportation: Customer analytics and loyalty marketing; Capacity & pricing optimization; Predictive maintenance optimization. Source:
41 Airlines Management of Seats: Manage overbooking capacity to ensure maximum seat factors with minimal downgrades. Loyalty Management: Identify different customer clusters; target clusters with appropriate messages, promotions and placement to ensure repeat revenue and reduced churn. Passenger Business: Missed connections; lost baggage; VIP customers; upselling during the booking process; overhead bin space optimization. Flight Operations: Aircraft re-routing; gate management; baggage management; parts management; forecasting; flight catering management; ground handling services; manpower planning; optimized staffing levels. Others: Sales area; managing traffic flows (O&D); brand buzz analysis; sentiment analysis.
42 Big Data and Fitness The new generation of personal physical fitness tracking tools and apps combines Big Data with mobile and wearable technology. They record, save, track, and monitor data from users' workouts and let users connect with one another to compare their progress and support each other. Garmin offers GPS devices and training tools for golfing, running, cycling, and swimming, as well as multisport fitness bands. With detailed readings of your time, distance, elevation, heart rate and calories burned (data depends on model) you can see where your strengths and weaknesses lie. General fitness, running, cycling, triathlons: choose your goal, and we'll give you a day-by-day plan for success. Get social motivation! When you connect with friends, foes or pros, they can see a feed of your activities, and you can see theirs. Source:
43 Big Data and Relationships Online dating apps and websites that utilize Big Data are increasingly being used by people to find dates and start relationships. They use the answers that users have provided, along with proprietary algorithms, to match people with each other. Some of them are location-based, using the GPS capabilities of users' phones to connect them with other people based on their geographical locations. In addition, social networking companies such as Facebook are analyzing user-generated data (such as people's relationship statuses) to identify trends in relationships. Skout is one of the largest global mobile networks for meeting new people. We are true believers in preserving the magic of serendipitous meetings and we make possible a wide range of social connections from friendship to networking. With your mobile device as your guide, you can discover new friends at the local neighborhood bar, at a concert at Madison Square Garden or on a bus tour in Barcelona. Our community spans more than 100 countries and taps our app to meet new people nearby or continents away.
44 Big Data and Health Big Data, wearable devices, the Internet of Things and mobile technology are converging to form health analytics applications. These apps are transforming the way we manage our own health, and the way healthcare providers and companies manage their patients' health. They give individuals as well as healthcare professionals, hospitals, healthcare companies, and pharmaceutical companies new insights based on user-generated data that will help them to make well-informed, individually tailored decisions. These apps enhance patient engagement between visits, encourage prescription medication compliance, and inform marketing strategies for the healthcare industry. In addition, some of these companies give individuals the opportunity to analyze their own personal health data via wearable devices, in order to adjust their choices and lifestyles in an effort to maximize their levels of wellbeing. Big Data is also revolutionizing medical research by helping researchers to visualize and analyze big datasets that were previously unattainable. Large datasets will also soon be used to predict and prepare for individual illnesses, times of increased demand on healthcare services, and more.
45 Big Data and Higher Education Big Data analytics apps help students to succeed, help instructors to know what students still need to learn, analyse efficiency in all areas, boost enrolment, and more. The dashboards enable Big Data analytics and visualization for the purpose of monitoring higher education KPIs such as enrolment, accreditation, effectiveness, research, financial information, and metrics by class and by department. Examples: CourseConnect, SAP, McGraw-Hill Connect.
46 Big Data and K-12 Education Big Data is revolutionizing K-12 education. It facilitates data-driven instruction, where teachers use students' assessment results to shape planning and instruction, and helps teachers to individualize instruction. These apps utilize adaptive learning algorithms, real-time assessment interpretation, predictive modelling, and real-time analytics to bring the power of systematic, real-time data to teachers and students. These apps and platforms provide performance-tracking capabilities, and some of them provide dashboards and other visual displays that teachers can use to drive instruction and understand students' strengths as well as the areas in which they need extra support. In addition, students can see and track their own progress with some of these mobile apps, which provide them with immediate, individualized feedback and insights into their own learning. The use of mobile devices & mobile analytics helps to make informed decisions and promote reading to discover. LearnSmart from McGraw-Hill is an intelligent, reliable and effective adaptive learning tool available to students.
47 Big Data and IT Operations IT Operations Analytics (ITOA) and log analysis platforms help to identify issues, increase the efficiency and performance of application systems, and deliver easily understandable insights to IT and management teams. ITOA solutions provide real-time analytics for monitoring applications. These solutions automatically identify and isolate disruptions and failures, and give IT operations teams the opportunity and the tools to solve problems faster than ever. These solutions enable decision makers to analyze large amounts of APM data faster and more intelligently. AppDynamics Application Intelligence delivers rich performance data, learning, and analytics, combined with the flexibility to adapt to virtually any infrastructure or software environment. Its low-overhead architecture means that you can deploy in production and benefit from real-time data collection and analysis.
48 Big Data and Customer Service Big Data gathered from customer interactions via phone, email, chat, social media, or another method can quickly and effectively provide valuable information about customer needs, behaviour, and preferences. That information can then be used to improve every customer's experience with the company and to build customer relationships through better engagement. ForeSee Satisfaction Analytics provides a comprehensive customer experience measurement system that gauges contact center performance from the customer's perspective. It provides continuous, reliable, and precise measurement of the contact center experience based on contact type, agent, call center, region, or other key criteria.
49 Big Data and Security The security of Big Data is essential at every level within an organization. It is vital to have the tools in place to protect the network, the business, customers, and all of the Big Data that is harnessed for competitive value. Security analytics apps provide analysts with real-time threat intelligence. With this new generation of enterprise security apps, analysts can ask more difficult, complex security questions of their data by utilizing application-layer attributes. These platforms are capable of capturing, processing, protecting, storing, searching, sharing, analysing, and visualizing the Big Data. Examples: Teradata, McAfee solutions.
50 Big Data and Supply Chain Big Data and cloud-based technologies allow real-time visualization and analysis of data, helping companies to make decisions faster and thereby increasing efficiency and effectiveness. Supply chains that are equipped with the ability to sense and respond to demand will become the most profitable. Examples: IBM, Teradata solutions.
51 Marketing & Media Mix
53 Big Data in Other Areas: Big Data and Recruitment; Big Data and Talent Management; Big Data and Marketing; Big Data and Sales; Big Data and Finance
54 Data Scientist Salaries for data scientists are rocketing upward, with the average salary now topping $123,000, as per Indeed.com. Some recruiters say that a mere two years of data science experience can translate into a $200,000 to $300,000 per year job. As Mitchell Sanders notes, data science requires a difficult blend of domain knowledge, math and statistics expertise, and code hacking skills. In particular, he suggests that expert knowledge of tools like R and SAS is critical: "If you can't use the tools, you can't analyze the data." Source:
55 8 Skills of Data Scientist source: Accenture
56 Predictions for Business Analysts in 2014: Skill Sets for Business Analysts and Systems Analysts Will Become Interchangeable; Agile Will Emerge as a Competency, Not a Methodology; The Roles of Business Analyst and Project Manager Will Overlap for SMBs; Early and Often Will Rule the Day; Requirements Management Will Grow in Sophistication; Career Paths Open Up; Project Sponsors Catch the BA Bug; Business Analysts Become the Big Man on Campus; Enterprise Architecture Comes Back to Life; Cloud Based Solutions Ahead
57 Laws of Big Data The faster you analyse your data, the greater its predictive value. Maintain one copy of your data, not dozens. Use more diverse data, not just more data. Data has value far beyond what you originally anticipate. Plan for exponential growth. Solve a real pain point. Put data and humans together to get the most insight. The focus in IT has shifted from Technology to Information.
58 Best Practices For Handling Big Data Make your data policies transparent. Ensure and regularly verify data quality. Guarantee data security. Provide data protection. Use the right tools. Define a reasonable period for the retention of data. Employ appropriate tagging implementation, maintenance, and other procedures. Make the value proposition for online behavioral advertising clear and explicit. Market your data and its power to help you reach specific audience segments, but do not exaggerate. Place non-proprietary data in the public domain.
59 Classic Big Data Mistakes: Focusing on Technologies Instead of Business; Not Knowing What You're Looking For; Disregarding Context; Dismissing Bias; Short-changing Data Quality; Not Securing Data Sponsorship; Not Executing a Cost-Benefit Analysis; Dwelling on What Already Happened
60 Things you shouldn't expect big data to do: Solve your business problems; Help your data management; Ease your security worries; Address critical IT skill areas; Diminish the value of legacy systems; Simplify your data center; Improve your data quality; Validate current ROI metrics; Create less "noise"; Work every time
61 16 top Big Data analytics platforms
62 16 NoSQL, NewSQL Databases
63 BiDA Master's Degrees: 20 Top Programs
64 Big Data Courses Big Data University Courses Big Data fundamentals Course Details
65 Big Data Trends Role of superfast WiFi: Advancements in WiFi technology will help spur big data growth. Huawei last month announced it had successfully tested 10-Gbps WiFi service in laboratory trials at its Shenzhen, China campus. Potential enterprise uses include WiFi-based location analytics to improve business intelligence, customer engagement, and security operations. Big Data & Deep Data: Big Data is used to manipulate our behaviour for commercial ends, whereas Deep Data is used to help people and communities see themselves, for transformative change. Big Data & Open Data: Open data is accessible public (big) data that people, companies, and organisations can use to launch new ventures, analyse patterns and trends, make data-driven decisions, and solve complex problems. Open data should also be relatively easy to use, although there are gradations of "openness", and there is general agreement that open data should be available free of charge or at minimal cost. Growth of Digital Universe: estimated to grow from 3.2 zettabytes (one zettabyte is one billion TB) today to 40 zettabytes in only six years. The data volume in the enterprise is going to grow 50x year-over-year between now and About 85% of that data will be coming from net-new data sources including mobile, social media, and web- and machine-generated data, which will present both a challenge and an opportunity for enterprises. Big Data As A Service: mainly for SMEs.