GINeVRA Digital Research Hub
Customized Report: Big Data
2014. All Rights Reserved.
Agenda
- Context
- Challenges and opportunities
- Solutions
- Market
- Case studies
- Recommendations
Context: Characteristics of Big Data

Constantly evolving information technologies and the growing use of the Internet in every aspect of people's lives have led to the creation of huge amounts of diverse and rapidly flowing data.

[Figure: Information generated and storage space. Series: total available storage space (Exabytes); digital information generated (Exabytes).]

If it is true that better information leads to better decisions, the potential value that Big Data can generate for its users is easy to understand. Big Data has three main features:
- High Volume: data are measured in Zettabytes (1 Zettabyte = 1000^7 bytes = 10^21 bytes);
- High Variety: data are not uniformly structured; data of different origins and formats are stored together in large databases;
- High Velocity: this applies both to how data originate (capture) and to how they are used, since rapid analysis is essential in order to create value.

*SOURCE: BTO elaboration of primary and secondary sources
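The Zettabyte figure quoted above can be checked with a few lines of arithmetic. This is a minimal sketch assuming decimal (SI) units, where each step up is a factor of 1000; the helper names `bytes_in` and `convert` are illustrative and do not come from the report.

```python
# Decimal (SI) storage units: each step up the list is a factor of 1000.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one decimal unit, e.g. 'ZB' -> 1000**7."""
    return 1000 ** UNITS.index(unit)


def convert(value: float, src: str, dst: str) -> float:
    """Convert a quantity from one decimal unit to another."""
    return value * bytes_in(src) / bytes_in(dst)


# One Zettabyte is 1000^7 bytes, i.e. 10^21 bytes.
print(bytes_in("ZB") == 1000 ** 7 == 10 ** 21)  # True
# The chart axes are in Exabytes; one Zettabyte is 1000 Exabytes.
print(convert(1, "ZB", "EB"))                   # 1000.0
```

Since the chart is expressed in Exabytes, the same helper converts its values to Zettabytes, for example `convert(2500, "EB", "ZB")`.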
Challenges and opportunities

The possible applications of Big Data are extremely diverse and affect a company's entire value chain, in both operations and support functions.

Big Data can create value along the whole value chain as defined by Porter. The following examples show how:

VALUE CHAIN
- Company infrastructure: automation, identification of malfunctions and energy savings.
- HR management: improved monitoring capabilities.
- R&D: more data and more complex analyses improve the development of new products that are closer to customers' needs.
- Supplies: real-time awareness of inventory levels allows better management of supplies.
- Inventory management: integrating the management systems of automated warehouses in order to optimize supplies and decrease costs.
- Operations: feedback for product innovation; information transparency; real-time performance monitoring.
- Marketing and sales: algorithms for setting prices; advanced promotional campaigns; market segmentation.
- Margin: fraud identification; monitoring of the abandonment rate and its underlying reasons; improved risk evaluation.
Solutions

The characteristics of Big Data in terms of quantity, speed and variety make it difficult to manage with relational databases. NoSQL technologies, by contrast, are able to ensure speed of analysis and scalability.

The path most frequently chosen is the use of NoSQL solutions, a type of database developed precisely to scale easily regardless of the quantity and variety of data.

NoSQL
- Unlike conventional SQL technology, the database is not necessarily structured in rows and columns; it can be organized in variable patterns;
- Data is partitioned and balanced across the nodes of the cluster, and aggregate queries are distributed by default;
- Not only the structure but also the way data is queried differs: instead of SQL, many NoSQL systems express aggregate queries through a programming model such as MapReduce.

ADVANTAGES:
- Streams of data can be made available in real time;
- It is easy to change how data is stored and to modify the queries in use;
- Migrations are no longer needed to make massive changes.
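The ideas of variable structure, partitioning across nodes and MapReduce-style aggregation can be illustrated in a few lines. This is a minimal in-memory sketch, not the API of any particular NoSQL product: the sample documents, the field names and the functions `partition`, `map_phase` and `reduce_phase` are all assumptions made for illustration, and the "nodes" are plain Python lists.

```python
from collections import defaultdict
from itertools import chain

# Schemaless records: documents do not need to share the same fields.
documents = [
    {"id": 1, "type": "order", "country": "IT", "amount": 120.0},
    {"id": 2, "type": "order", "country": "DE", "amount": 80.0},
    {"id": 3, "type": "review", "country": "IT", "stars": 4},
    {"id": 4, "type": "order", "country": "IT", "amount": 50.0},
]


def partition(docs, n_nodes):
    """Spread documents over nodes by key (here: id modulo n_nodes)."""
    nodes = [[] for _ in range(n_nodes)]
    for doc in docs:
        nodes[doc["id"] % n_nodes].append(doc)
    return nodes


def map_phase(docs):
    """Emit (country, amount) pairs for order documents only."""
    return [(d["country"], d["amount"]) for d in docs if d.get("type") == "order"]


def reduce_phase(pairs):
    """Sum the emitted values per key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def map_reduce(docs, n_nodes=2):
    nodes = partition(docs, n_nodes)
    # In a real cluster each node would run its map step in parallel.
    mapped = [map_phase(node) for node in nodes]
    return reduce_phase(chain.from_iterable(mapped))


totals = map_reduce(documents)
print(totals)
```

Because the map step only reads the fields it needs, the review document with a different structure can sit in the same collection without any schema change.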
Market

The market for Big Data is characterized by the dominance of the large IT companies and by a growing number of startups created to meet specific needs.

Listed below are the market leaders in terms of turnover: major IT vendors that are not focused only on Big Data but offer a portfolio of solutions covering every stage of the process:
- IBM: its main products are InfoSphere BigInsights and InfoSphere Streams, for the analysis of data flows;
- HP: offers diverse solutions aimed at every infrastructure and information management need;
- ORACLE: has a large portfolio of infrastructure, data management and advanced analysis products.

Another category to consider is that of the Pure Players in Big Data, companies that specialize solely in this area. Among these are fast-growing companies such as:
- 10GEN: offers solutions based on the NoSQL database MongoDB;
- OPERA SOLUTIONS: specialized in complex predictive analysis of data;
- SPLUNK: offers services and software based on Apache Hadoop;
- 1010DATA: intelligence systems with a strength in data visualization;
- CLOUDERA: data analysis and processing in the Cloud;
- ACTIAN: offers Big Data management software, analytic engines and databases.
Case studies

The analysis of business data has enabled GE to get an overview of suspicious behavior and to identify patterns of fraud.

COMPANY: General Electric Consumer & Home Services is a division of the GE Group that, among its various tasks, manages refunds to technicians for resolving technical problems with GE consumer products still under warranty, such as washing machines, dryers and refrigerators.

PROBLEM: The division manages approximately 1 million files per year, many of which are frauds by technicians who obtain reimbursement for jobs that have not actually been carried out. Manual processing could not identify these frauds because it gave no overview of the issue and could not pick out the specific behaviors that may indicate such scams.

SOLUTION: With the help of a provider, GE processed all the old files in order to identify the patterns that indicated fraud, and then implemented these patterns as an algorithm in the refund file management system. Today, the system automatically identifies the files that indicate probable fraudulent behavior.

RESULT: GE has estimated that, following the implementation of this system, savings in the first year were approximately $5.1 million.
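The general approach described in the solution, encoding known fraud patterns as rules and flagging the files that match them, can be sketched as follows. This is only an illustration under stated assumptions: the report does not disclose the patterns GE actually used, so the `RefundFile` fields, the three rules, the thresholds and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RefundFile:
    # Hypothetical fields; the real refund files are not described in the report.
    file_id: int
    technician: str
    amount: float
    minutes_on_site: int
    repeat_visit_within_30_days: bool


# Hypothetical patterns, standing in for those mined from historical files.
RULES = {
    "very_short_visit": lambda f: f.minutes_on_site < 10,
    "high_amount": lambda f: f.amount > 300.0,
    "repeat_visit": lambda f: f.repeat_visit_within_30_days,
}


def matched_rules(f):
    """Return the names of all rules that a refund file matches."""
    return [name for name, rule in RULES.items() if rule(f)]


def flag_suspicious(files, min_matches=2):
    """Flag files matching at least `min_matches` rules, for manual review."""
    flagged = {}
    for f in files:
        hits = matched_rules(f)
        if len(hits) >= min_matches:
            flagged[f.file_id] = hits
    return flagged


files = [
    RefundFile(1, "T-01", 120.0, 45, False),
    RefundFile(2, "T-02", 450.0, 5, False),
    RefundFile(3, "T-02", 90.0, 8, True),
    RefundFile(4, "T-03", 320.0, 60, False),
]
flagged = flag_suspicious(files)
print(flagged)
```

Requiring several matching rules before flagging a file is a design choice made here to limit false alarms; a production system would more likely weight the rules or learn them from labelled historical files.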
Recommendations

The profitable management of Big Data depends not only on the choice of appropriate technology but also on the development of enabling skills.

[Figure: the Data Scientist combines IT competencies, functional competencies and analytic competencies.]

Implementing a system to exploit Big Data is an evolutionary process that runs across the whole company, and its success depends heavily on corporate organization. The path must be gradual and shared. In particular, certain best practices increase the chances of success of such projects:
- Create a project team of people with a balanced mix of analytical, functional and IT skills: the "Data Scientists";
- Start with small projects in the areas of most value to the business. In most cases these are initiatives that analyze customer needs in order to make predictions. It is often useful to start by analyzing the data warehouse already present in the company, which is well known, before searching for new data. In this way experience is gained and results are seen immediately through small wins;
- Before starting, produce a document explaining how the company intends to leverage Big Data to achieve its business objectives (scope, requirements, architectures needed). This document must be widely distributed so that it can guide all parties involved in the implementation process.