Big Data Perspectives for Germany: Seize the Opportunity. Prof. Dr. Stefan Wrobel, Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS, Fraunhofer Big Data Initiative. www.iais.fraunhofer.de, bigdata.fraunhofer.de
Fraunhofer IAIS: Intelligent Analysis and Information Systems. Do more with data: from sensor data to business intelligence, from media analysis to visual information systems, our technology allows enterprises to do more with data. 200+ employees at the Birlinghoven castle campus close to Bonn. Research areas: Machine Learning and Data Mining, Multimedia Pattern Recognition, Visual Analytics, Process Intelligence, Autonomous Systems.
Fraunhofer: from innovation to market. Big Data Infrastructure, Visual Analytics, Machine Learning. Basic research, core competences, customers.
Big Data everywhere
Big Data trends: convergence, ubiquitous intelligent systems, www, user content, open data.
Zetta, Zebi, Yotta and Yobi [Wikipedia, 2011]
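The four prefixes on this slide come in decimal (SI) and binary (IEC) pairs. As a minimal sketch, the following Python snippet spells out their values and how far each binary prefix exceeds its decimal counterpart. The dictionary and function names are illustrative and not part of the original slides.

```python
# Decimal (SI) and binary (IEC) prefixes named on the slide.
PREFIXES = {
    "zetta": 10**21,  # SI: 1000^7
    "zebi": 2**70,    # IEC: 1024^7
    "yotta": 10**24,  # SI: 1000^8
    "yobi": 2**80,    # IEC: 1024^8
}


def relative_gap(decimal_name, binary_name):
    """Return by what fraction the binary prefix exceeds the decimal one."""
    return PREFIXES[binary_name] / PREFIXES[decimal_name] - 1


for dec, bin_ in [("zetta", "zebi"), ("yotta", "yobi")]:
    gap = relative_gap(dec, bin_)
    print(f"1 {bin_}byte is {gap:.1%} larger than 1 {dec}byte")
```

The gap grows with each prefix step, which is why the distinction matters more at zettabyte and yottabyte scale than at kilobyte scale.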
Stored data at US enterprises. For example: 1.5 billion new entries at Tesco per month; 2.5 petabytes data warehouse at Walmart [McKinsey, 2011]
Open Data: examples of publicly available data sources. 6 billion web pages; 200 TB of genomic data; 86 billion n-grams; 400 million facts; more than 270 data catalogues; 4.1 million English articles.
Big Data: the view of BITKOM, the German IT association. Volume: number of records and files; yottabytes, zettabytes, exabytes, petabytes, terabytes. Variety: external data (web, open data, etc.); company data; unstructured, semi-structured and structured data; presentations, text, video, images, tweets, blogs; machine-to-machine communication. Velocity: high-speed data generation; constant transmission of generated data in real time; milliseconds, seconds, minutes, hours. Discovery of relationships, patterns and meaning; prediction models; data mining, text mining, image analytics, visualization, real-time analytics. Source: BITKOM Big Data Leitfaden, 2012, BITKOM AK Big Data.
Big Data: an attempt at a definition. Big Data in general refers to: the trend towards the availability of ever more detailed and ever closer to real-time data; the switch from a model-driven to a model- and data-driven approach; the economic potential that results from the analysis and use of big data when properly integrated into company processes. Big Data currently focuses technically on the following aspects: Volume, Variety, Velocity; in-memory computing, Hadoop, etc.; real-time analysis and effects of scale. Big Data must take implications for society into account.
Sources: http://m.sybase.com/detail?id=1095954 and McKinsey study, 2011
Innovation study Big Data: detailed overview of the national and international Big Data landscape. Desk research (current state of affairs): more than 50 systematic Big Data business cases. In-depth workshops for industry sectors (qualitative study): expert workshops for Finance, Telecom, Market Research, E-Commerce, Insurance. Online survey (quantitative study): 1.10.2012 to 30.11.2012, 82 high-ranking executives from small and large companies.
Sector workshops Big Data: Finance, Telecom, Insurance, Market Research, E-Commerce
Characteristic areas of companies for Big Data applications according to sector
Most frequent goals: increased revenue and cost savings
Per-sector view of tasks for Big Data applications
Real-time versus non-real-time and automated versus non-automated analysis
Overall view: 69% of all respondents are striving to gain strategic advantages from Big Data. 78% answer that they need to improve human resources for Big Data. 67% of respondents say that the budget for Big Data topics (technologies, analyses, data sources; excluding personnel) must increase. Only 8% of respondents say that there are no barriers to Big Data success. These results hold across all sectors.
Insights from the qualitative per-sector workshops: more efficiency from intelligent information systems; mass individualization of products and services; intelligent products adapt while in use.
Big Data in sales forecasting: more efficiency from intelligent information systems. Idea: predict sales at the article level more precisely. Big Data: more than 100 million records per week added to the system. Benefits: higher availability and greater economic efficiency. http://www.blue-yonder.com
Suppliers and technologies in the context of Big Data (Selection)
Challenges for realization. Respondents see the main problems in the following areas: data security and privacy (49%); budget and priorities (45%); technical challenges of data management (38%); expertise (36%); insufficient knowledge about Big Data possibilities (35%). To address the current deficits, 95% of respondents are looking for best practices, training, surveys of suppliers and solutions, and improved privacy regulations.
Fraunhofer initiative Big Data: joint competences in a »Big Data Factory« for Germany. Strategies, solutions and successes. 20 Fraunhofer institutes, one central coordination point. Synchronized and broad competence portfolio with many years of expertise in big data in different sectors. Best-of-class Big Data solutions for individual projects, consulting and qualification of personnel. Benefit from the future today! bigdata.fraunhofer.de
Big Data realized by Fraunhofer: Visual Analytics for more security; reliable supply chains; fraud recognition in finance data; efficient production.
Visual Analytics for enhanced security: react faster. Visual Analytics systems support decision makers live in the process of evaluating, understanding and acting on security risks in distributed infrastructures. Fraunhofer solutions increase the security and stability of critical infrastructures such as power or communication networks. Leading suppliers and operators manage, monitor and optimize their networks with Visual Analytics applications.
Reliable supply chains: control logistics processes while they run. Sensor-based information systems deliver real-time situation assessments and recognize disturbances in the supply chain in a productive manner. Fraunhofer assistance systems protect against unexpected supply problems and increase resource efficiency. The info broker software ensures the success of all companies in the supply chain, from the original supplier all the way to the manufacturer.
Fraud recognition in finance: recognize fraudsters in real time. Big Data algorithms recognize fraudulent credit card transactions in milliseconds. Fraunhofer software protects credit card companies and their customers. The software is in day-to-day use at a leading European payment transaction company and protects a portfolio of several million credit cards.
Increased production efficiency: optimize production at the push of a button. Big Data information systems condense millions of individual messages into smart indicators. Fraunhofer software protects against standstills, increases efficiency and ensures the quality of production. Our manufacturing intelligence system is in use at an international automobile company. Scalable technology for the Internet of Things.
Fraunhofer Living Lab Big Data: a core architecture for scalable and real-time analytics and the basis for our Data Scientist training course. Big Data batch application: analysis of customer feedback. Real-time application: Big Data research monitor. 5 billion web pages (Q1/2012), ~20 TB text only. Selected technologies, use cases, Big Data dataset.
What's happening on the internet? Consumers are networking in ways never seen before. The number of postings about products and brands is growing disproportionately. Soon more than 6 billion consumers will use mobile devices at the point of sale to read things from the internet and use them for their purchase decisions (ITU, International Telecommunication Union).
Recognize important customer feedback among millions of postings and web pages. Big Data process chain: collection of requirements, data validation, specification; collection of data, data pool; customization and operation of the system, running system; consulting, running service -> permanent flow of relevant information. Online EmotionsRadar.
Mobility Mining for outdoor advertising: use of mobility data to predict the effectiveness of media advertising. Question: how many people pass a given poster board on any given day? What is the distribution between public transport, cars and pedestrians? What is special about the model? First model for 6.9 million street segments in Germany. Central element in Germany for determining the reach of outdoor advertising. Basis for all traffic-related questions in market research.
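The two questions on this slide (daily passages per poster location and the split between transport modes) can be illustrated with a small aggregation. This is only a sketch on made-up observations: the record layout, segment identifiers and function names are hypothetical, and it counts observed passages rather than reproducing the predictive model described on the slide.

```python
from collections import Counter, defaultdict

# Hypothetical passage records: (street_segment_id, day, transport_mode).
passages = [
    ("seg-001", "2012-10-01", "car"),
    ("seg-001", "2012-10-01", "pedestrian"),
    ("seg-001", "2012-10-01", "car"),
    ("seg-001", "2012-10-01", "public_transport"),
    ("seg-002", "2012-10-01", "pedestrian"),
]


def daily_mode_counts(records):
    """Count passages per (segment, day), broken down by transport mode."""
    counts = defaultdict(Counter)
    for segment, day, mode in records:
        counts[(segment, day)][mode] += 1
    return counts


def mode_shares(mode_counter):
    """Turn absolute counts per mode into shares that sum to 1."""
    total = sum(mode_counter.values())
    return {mode: n / total for mode, n in mode_counter.items()}


counts = daily_mode_counts(passages)
segment_day = counts[("seg-001", "2012-10-01")]
print("passages:", sum(segment_day.values()))
print("shares:", mode_shares(segment_day))
```

Keying the counts by segment and day keeps the per-poster question and the modal-split question answerable from the same structure.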
Information source mobility data: Mobility Mining helps understand cell phone data. Quality of cell phone data: high coverage of the population; no cost-intensive data collection; allow a view of spatial and temporal dynamics of mobility at different levels; can be processed in real time. Figure (Munich): indication of cell load from GSM data. Our research expertise: cell phone data are indicators for mobility. Projects: 2005 GeoPKDD (EU FET), 2010 MODAP (EU CA), 2011 LIFT (EU FET), 2011 DATASIM (EU FET).
Cell phone example: Allianz Arena. One value per hour. Labeled events: 29 and 30.7.09 Audi Cup; Champions League vs Juventus Turin; DFB-Pokal vs Eintracht Frankfurt; Champions League vs Lyon; Champions League vs Manchester; Champions League vs AC Florenz; international match Germany vs Argentina; Bundesliga home games of FC Bayern München.
Our approach: integration of heterogeneous data sources. Diagram elements: Frequency Map, GPS, Dynamic Mobility Model, cell phone data (GSM), household database, geodata, interviews (CATI).
Privacy-preserving Data Mining: reconciles Data Mining and data privacy. Legal questions and public opinion; also: protection of company interests in distributed Data Mining. Privacy by Design: development of privacy-compatible analytics; guaranteed anonymity, guaranteed results. Project examples: Data Mining in fraud detection for Banco Bilbao Vizcaya Argentaria (BBVA) and Arvato Infoscore; LIFT Safe Zone technology.
Big Data, Big Opportunities. Data are a resource that will be decisive in competition. Big Data technologies allow the intelligent analysis and linking of big and heterogeneous data in real time. With the right approach, Big Data and privacy are no contradiction. New perspectives for better products, more efficient production and resource-effective action. Companies become data-driven enterprises. The challenge: technologies and business know-how must be integrated into business and production processes in order to create value.