Big Data Analytics Workshop #1. First International Workshop on Big Data Applications and Principles (BigDAP 2014)


ONTIC Project (GA number 619633)
Deliverable D6.8
Dissemination Level: PUBLIC

Authors:
Alberto Mozo (UPM)
Sandra Gómez (UPM)
Bruno Ordozgoiti (UPM)
Enrique Fernández (UPM)

Version: ONTIC_D6.8.2014.10.24.1.3

Version History

Version | Modification date | Modified by | Summary
Draft | 2014-10-24 | UPM | First version
1.0 | 2014-11-05 | UPM | Added reviewer changes
1.1 | 2014-11-06 | UPM | Minor grammatical changes
1.2 | 2014-11-14 | UPM | Added Annex A.2
1.3 | 2014-11-27 | POLITO | Typos and minor changes

Quality Assurance:

Role | Name
Quality Assurance Manager | Elena Baralis (POLITO)
Reviewer #1 | Alejandro Bascuñana (Ericsson)
Reviewer #2 | Fernando Arias (EMC)

Copyright 2014, ONTIC Consortium

The ONTIC Consortium (http://ict-ontic.eu/) grants third parties the right to use and distribute all or parts of this document, provided that the ONTIC project and the document are properly referenced.

THIS DOCUMENT IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENT, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Table of Contents

1. Acronyms and Definitions
2. Executive Summary
3. BigDAP Definition
   3.1 Creation
   3.2 Visibility and image
   3.3 Goals
   3.4 BigDAP Audience
4. Workshop Structure
   4.1 Sessions structure
   4.2 Scope and topics
5. Workshop Organization
   5.1 Program Committee
   5.2 Organizing Committee
6. Workshop Program
   6.1 Definition of activities related to the workshop program
      6.1.1 Call for papers
      6.1.2 Paper submission and acceptance criteria
      6.1.3 Submission process platform
   6.2 Program structure definition
      6.2.1 Academic track
      6.2.2 Industrial track
   6.3 Abstracts for accepted papers and invited talks
7. Workshop Web
   7.1 Workshop website as part of the ONTIC project portal
   7.2 Web structure
8. Dissemination and Advertising of the Workshop
   8.1 Email distribution lists
   8.2 Visibility at the Universidad Politécnica de Madrid website
   8.3 Workshop promotion on audiovisual media
9. Workshop Development
   9.1 Registration process
   9.2 Receipt of grant application for students
   9.3 Attendees
10. Results
11. Future Plans
Annex A
   A.1 Caller structure
   A.2 Proceedings

List of figures

Figure 1: BigDAP logo
Figure 2: Access from ONTIC portal to BigDAP web
Figure 3: BigDAP web page
Figure 4: Menu web BigDAP
Figure 5: ETSISI web divulgation of BigDAP 2014
Figure 6: BigDAP 14 poster
Figure 7: Event pictures

List of tables

Table 1: Acronyms
Table 2: Important Dates
Table 3: BigDAP 2014 program
Table 4: Fees and types of registration
Table 5: Attendees statistics

1. Acronyms and Definitions

Acronym | Defined as
BigDAP | Big Data Applications and Principles
ETSISI | Escuela Técnica Superior de Ingeniería de Sistemas Informáticos

Table 1: Acronyms

2. Executive Summary

The main aim of Work Package 6 (Exploitation and Dissemination) is to ensure maximum awareness and visibility of the project results. In particular, this work package promotes the dissemination and adoption of the ONTIC outcomes in other application domains such as bioinformatics, genomics, medicine, physics, social sciences and finance.

In this context, the First International Workshop on Big Data Applications and Principles (BigDAP 2014) was conceived to provide a platform for researchers and developers, from both academia and industry, to share ideas, knowledge and information about current approaches and technologies as well as the latest developments regarding Big Data.

BigDAP 2014 was hosted and sponsored by the FP7 ONTIC project (Online Network TraffIc Characterization, http://ict-ontic.eu), which is funded by the European Commission under the Seventh Framework Programme. The application of scalable Big Data analytics to network traffic characterization was therefore a key topic of the workshop.

The BigDAP 2014 workshop took place at the E.T.S. Ingeniería de Sistemas Informáticos (Universidad Politécnica de Madrid), Madrid, Spain, from Thursday, September 11 to Friday, September 12, 2014.

3. BigDAP Definition

3.1 Creation

BigDAP was an international workshop conceived to disseminate the activities carried out in the ONTIC project and the results achieved. In addition, BigDAP was organised to increase public awareness of the project, to encourage new research activities in its areas of interest and to provide a networking space for academic institutions and members of the industry. In this regard, BigDAP was an opportunity to exchange ideas, information, experiences and contacts, which fueled the advancement of the ONTIC project in the academic, industrial and scientific areas.

To achieve this, BigDAP was structured in two blocks: the scientific/academic track and the industrial track. This combination conveyed complementary knowledge to both the audience and the participants.

3.2 Visibility and image

To attain a good level of visibility, a name and a logo were carefully conceived. BigDAP is a combination of the terms Big Data and Dap, which aims to represent the significant advances that have stemmed from research in this area in recent years.

[Figure 1: BigDAP logo]

The logo is meant to capture the spirit of the workshop. The kangaroo represents the great leap forward that, as previously mentioned, comes from research in the field of Big Data.

3.3 Goals

One of the main goals of BigDAP was the dissemination of the activities and results of the ONTIC project. Other specific goals were:

- To provide a networking space for academic institutions and members of the industry.
- To enhance the participation of graduate students, especially those enrolled in PhD programs related to Big Data, Big Analytics and Network Science.
- To introduce undergraduate students to these topics.

3.4 BigDAP Audience

The intended audience of the workshop can be grouped as follows:

- Industrial sector: professionals involved in R&D&I activities.
- Enterprise sector: representatives of SMEs and large corporations interested in improving their organizational structures by leveraging novel technologies.
- Professors and researchers whose research interests are focused on the activities carried out at ONTIC.
- PhD students who are working on a thesis related to the activities carried out at ONTIC.
- Students enrolled in programs related to the activities carried out at ONTIC, or with the intention of developing their career in a related area.

To encourage students to attend the workshop, a number of scholarships covering all expenses were granted. These scholarships were awarded to both undergraduate and graduate students.

4. Workshop Structure

BigDAP 2014 is a workshop aimed at promoting and displaying excellent research and innovation on Big Data. The workshop provided a platform for researchers and developers, from both academia and industry, to share ideas, knowledge and information with the purpose of enriching the current Big Data ecosystem.

BigDAP 2014 consisted of a series of invited talks, one industrial track and peer-reviewed scientific contributions. Significant room was reserved for PhD students, to allow them to share ideas and know-how related to Big Data topics.

BigDAP 2014 encourages high-quality research in all branches of Big Data application. Its broad scope provides an opportunity to bring together researchers and industrial companies motivated by the exchange of theoretical, practical and experimental approaches in real, implemented use cases.

4.1 Sessions structure

In this first edition of the event, two main tracks were defined in order to gather both the industrial and the academic perspectives:

- Academic track: this track consisted of a series of invited talks and peer-reviewed scientific contributions. The invited talks were delivered by world-class academic and scientific experts who work in fields related to the topics of the workshop. All of the scientific contributions were papers sent to the workshop via the submission process and formally peer-reviewed by the Programme Committee.
- Industrial track: this track featured invited talks by representatives of major corporations with an R&D department focused on areas related to the topics of the workshop. Their contributions helped the audience understand the implications of real-world applications of these techniques and how these companies are trying to adopt and incorporate new development programs in their R&D departments to be ready for the challenge of Big Data.

4.2 Scope and topics

Topics of either theoretical or applied interest included, but were not limited to:

- Techniques, models and algorithms for Big Data
- Scalable Data Mining and Machine Learning techniques and mechanisms
- Big Data frameworks and architectures
- NoSQL, NewSQL and Graph databases
- Verification, Validation and Testing Big Data applications
- Big Data and analytics in Telecommunication, Social Media, Bioinformatics, health care, medicine, finance, business, law, education, transportation, science, engineering, ecosystem, etc.
- Multimedia and unstructured data management for Big Data
- Parallel, distributed computing and virtualization for Big Data
- Hardware/software infrastructure for Big Data
- Big Data Security and Privacy challenges
- Cleaning Big Data (noise reduction), acquisition & integration
- Grid and stream computing for Big Data
- Programming models and environments to support Big Data
- Multidimensional Big Data
- Algorithms for enhancing data quality

5. Workshop Organization

Two members of the ONTIC project were in charge of the general coordination activities. Since the workshop was held at the ETSISI-UPM, both the chair (professor Alberto Mozo) and the co-chair (professor Antonio Hernando) were members of the UPM team.

- Alberto Mozo: Universidad Politécnica de Madrid, Spain. Coordinator of the ONTIC Project.
- Antonio Hernando: Universidad Politécnica de Madrid, Spain. UPM member of the ONTIC Project.

The management and academic organization activities were divided between two committees. The Programme Committee was in charge of all management and decision-making processes related to the academic and scientific aspects of the workshop. The Organizing Committee was mainly in charge of management activities, as well as some academic tasks. A list of the members of each committee follows.

5.1 Program Committee

- Alberto Mozo (Universidad Politécnica de Madrid)
- Antonio Hernando (Universidad Politécnica de Madrid)
- Daniele Apiletti (Politecnico di Torino, Italy)
- Fernando Arias (EMC Spain)
- Alejandro Bascuñana (Ericsson Spain)
- Elena Baralis (Politecnico di Torino, Italy)
- Luca Cagliero (Politecnico di Torino, Italy)
- Sotiria Chatzi (ADAPTIT, Greece)
- Luigi Grimaudo (Politecnico di Torino, Italy)
- Miguel Angel Lopez (SATEC Spain)
- Miguel Angel Monjas (Ericsson Spain)
- José María Ocón (SATEC Spain)
- Philippe Owezarski (Centre National de la Recherche Scientifique, France)
- Spyros Sakellariou (ADAPTIT, Greece)
- Bo Zhu (Universidad Politécnica de Madrid)

Main responsibilities of the Programme Committee:

- To define the topics and the main theme of the workshop.
- To peer-review the submitted papers.
- To select which papers were to be presented at the workshop.
- To define the structure and the tracks of the workshop.
- To invite personalities from the industrial world to participate in the industrial track.
- To invite personalities from the academic world to deliver invited talks or tutorials.
- To define the workshop calendar (paper submission deadlines, acceptance notification deadlines, registration deadlines).
- To carry out the technical review of the proceedings.

5.2 Organizing Committee

- Sandra Gómez Canaval (Universidad Politécnica de Madrid)
- Enrique Fernández Cantón (Universidad Politécnica de Madrid)
- Alex Martínez Bravo (Universidad Politécnica de Madrid)
- Alvaro Guadamillas Herranz (Universidad Politécnica de Madrid)
- Nuria Manchado Bustelos (Universidad Politécnica de Madrid)

Main responsibilities of the Organizing Committee:

- Dissemination of the event: calls, distribution lists, dissemination media.
- Management of the paper submission platform.
- Notification of evaluation and acceptance to authors.
- Organization of the event schedule.
- Management of participant registration, arrival, attendance, credentials and certificates.
- Production of the proceedings.
- Development and management of the website.
- Management of dissemination channels and event contacts.
- Coffee and lunch breaks.
- Scholarships.
- Support for calendar management, workshop configuration and general logistics.

An e-mail account was created for questions and information requests. E-mail: BigDAP@ict-ontic.eu

6. Workshop Program

The calendar for the event was established by the Programme and Organizing Committees, as defined in the following table:

Important Dates
July 31, 2014 | Deadline for paper submission (closing at 23:59h, CET)
August 1, 2014 | Deadline for paper submission (closing at 23:59h, CET) (EXTENDED)
August 10, 2014 | Notification of paper acceptance or rejection
August 20, 2014 | Early registration
September 3, 2014 | Final version of the paper for the proceedings
September 8, 2014 | Late registration

Table 2: Important Dates

6.1 Definition of activities related to the workshop program

6.1.1 Call for papers

Papers were expected to present research or industrial contributions concerning the topics of the workshop. Papers had to be written in English and had to provide sufficient detail for the Programme Committee to assess their merits. Calls were sent out in the following phases:

- Event launch: first call, at the beginning of May, 2014.
- Call for contributions: one month after the first call.
- Contribution call reminder: one month before the submission deadline.
- Last call for contributions: one week before the submission deadline.
- Extended submission: five days before the extended submission deadline.

The contents of the call e-mail are included in Annex A.1 (Caller structure).

6.1.2 Paper submission and acceptance criteria

Submitted papers underwent a peer-review process by Programme Committee members. The Programme Committee selected the contributions to be presented at the workshop and published in the proceedings according to the criteria of originality, clarity and completeness. Emergent research works aiming to receive feedback were also accepted and published in a brief announcement format. Additionally, relevant dissemination works that had already been published were accepted with the aim of generating synergies, collaborations, new challenges and future work among Big Data community members. Double submission was also allowed.

Selected papers were presented during the workshop and published as a volume of BigDAP 2014 workshop proceedings. Each accepted paper had to be presented at the workshop by one of its authors.

Submissions had to conform strictly to the LNCS format (instructions at http://www.springer.de/comp/lncs/authors.html) and could not exceed 10 pages, including figures, tables and references. Additional details could be included in a clearly marked appendix, which would be read, or not, at the discretion of the reviewers. LaTeX was the preferred format, but Word templates were also available. The first page had to indicate the title of the paper, author names and affiliations, an abstract, an email address, the type of contribution (regular paper, industrial track, brief announcement or dissemination work) and a list of keywords.

6.1.3 Submission process platform

To facilitate the management of submitted papers, an EasyChair account was created (http://www.easychair.org/conferences/?conf=bigdap14). This web-based conference management platform was used for managing the reception, review and evaluation of the papers. It was also used to notify applicants of the acceptance of their submissions. The submission process was thus carried out in a fully electronic fashion.

6.2 Program structure definition

Since the workshop addresses its topics from two separate but complementary perspectives (academic and industrial), the program was structured in two main blocks. The first of the two days of BigDAP '14 focused on the academic contributions. The second day was dedicated to the industrial track.

6.2.1 Academic track

This track took place during the first day. It was divided into two sessions, with a lunch break in between. The first session was devoted to the application of Big Data in different domains (e.g. automotive, health, aeronautics). The second session focused entirely on the telecom domain. Each of these two sessions was itself divided into two parts by a coffee break. These breaks provided an opportunity for networking and information exchange between participants.

- Invited talks: both sessions featured two sixty-minute-long invited talks.

- Accepted papers: of all submitted papers, only those that received a positive evaluation from all reviewers (members of the Programme Committee) were accepted. The evaluation criteria included originality, impact on the topics of the workshop, clarity and completeness. Papers were grouped by topic similarity. It was agreed that each paper would be presented in approximately thirty minutes. The acceptance rate was 62.5%.

6.2.2 Industrial track

This track took place during the second day. Several multinational enterprises that are worldwide leaders in their business areas (BBVA, Ericsson, Telefonica, EMC and Indra) were invited. BigDAP thus showcased different Big Data perspectives from the telecom, banking and cloud sectors. Among others, the industrial part of the workshop featured a talk by Ajit Jaokar, who runs his own company and teaches courses on Big Data and Internet of Things at Oxford University. Each of these talks had a duration of sixty minutes.

The following table contains the final schedule of the workshop.

Thursday, September 11th 2014
9:00-9:30 | Registration
9:30-9:40 | Inaugural Session
9:40-10:30 | Invited Talk: Combining Graphs and Big Data to Recommend Apps | A. Fernández Anta (IMDEA Networks)
Session 1
10:30-11:00 | Electronic Health Records Analytics: Natural Language Processing and Image Annotation | R. Costumero, Á. García-Pedrero, I. Sánchez, C. Gonzalo-Martin and E. Menasalvas
11:00-11:30 | Query-driven Ontology for BigData | F. Calle, J. Morato, E. Castro, D. Cuadra and E. Albacete
11:30-12:00 | Coffee break
12:00-12:30 | Invited Talk: Probabilistic models for processing texts | Antonio Hernando (Universidad Politecnica de Madrid)
12:30-13:00 | Anomaly detection in recordings from in-vehicle networks | Andreas Theissler
13:00-13:30 | Generating, repairing, and manipulating aeronautic databases | J. M. Vega
13:30-14:30 | Lunch
Session 2
14:30-15:00 | Network traffic analysis by means of Misleading Generalized Itemsets | D. Apiletti, E. Baralis, L. Cagliero, T. Cerquitelli, S. Chiusano, P. Garza, L. Grimaudo and F. Pulvirenti
15:00-15:30 | Unsupervised Detection of Network Attacks in the dark | P. Owezarski, P. Casas and J. Mazel
15:30-16:00 | A Survey on feature selection and reduction for network traffic characterization | B. Zhu and A. Mozo
16:00-16:30 | Coffee break
16:30-17:00 | A Telecom Analytics Framework for Dynamic Quality of Service Management | A. Guadamillas, M.A. López, N. Maravitsas, A. Mozo and F. Pulvirenti
17:00-17:30 | Adaptive Quality of Experience (AQoE) control for Telecom Networks | A. Bascuñana, F. Castro, D. Espadas, P. Sánchez and M.A. Monjas

Friday, September 12th 2014
Session 3
9:30-10:30 | Invited Talk: IoT and Machine learning | Ajit Jaokar (FutureText & Oxford University)
10:30-11:00 | Coffee break
11:00-11:30 | SmartSteps and Big Data at Telefonica | Jose Luis Agundez (Telefonica S.A.)
11:30-12:00 | Capturing Value from Big Financial Data | Daniel Villatoro (BBVA Data & Analytics)
12:00-12:30 | Business Redefined | Alejandro Gimenez (EMC Spain)
12:30-13:00 | Businesses based on customers' advanced analyses | Mónica León Santamaría (INDRA)
13:00-13:30 | Big Data: Opportunities for the Telecom Industry | Manuel Lorenzo (Ericsson Spain)
13:30-14:00 | Big Data, bigger problems for data protection? | Celia Fernandez Aller (Universidad Politecnica de Madrid)
14:00-14:15 | Final session

Table 3: BigDAP 2014 program

6.3 Abstracts for accepted papers and invited talks

The abstract of every presentation (papers and invited talks) is included below.

Electronic Health Records Analytics: Natural Language Processing and Image Annotation
R. Costumero, Á. García-Pedrero, I. Sánchez, C. Gonzalo-Martin and E. Menasalvas

Big data applications in the healthcare sector indicate a high potential for improving the overall efficiency and quality of care delivery. Health data analytics relies heavily on the availability of Electronic Health Records (EHRs). The complexity of healthcare information management is due not only to the amount of data generated but also to its diversity and the challenges of extracting knowledge from unstructured data. Until now, no integrated solution to process, mine and extract knowledge has been proposed. In this paper we propose an architecture, TIDA, to deal in an integrated way with all the information contained in EHRs. TIDA (Text, Image and Data Analytics) makes it possible to address the problem of Spanish text indexing in the healthcare domain by adapting different techniques and technologies; in addition, components have been included to deal with image segmentation.

Query-driven Ontology for BigData
F. Calle, J. Morato, E. Castro, D. Cuadra and E. Albacete

The proposal presented in this paper is focused on improving the semantics of queries through a linguistic ontology. Interpreting the data together with their linguistic relationships will make query formulation easier and could also help to carry out powerful analyses over the BigData. We propose a way to improve queries by means of natural language tools. Our approach is twofold: on the one hand, an ontology is used to include the context of the user. On the other hand, reliability predictors are included in the analysis in order to answer with relevant data. As a result, the proposal brings together linguistic and retrieval technologies to provide an innovative approach to a common concern.

Probabilistic models for processing texts
A. Hernando

Over the last decade, researchers have proposed different probabilistic models for processing texts which have been very successful in applications such as document clustering and information retrieval. In this paper, we survey some of the most important techniques: Latent Semantic Analysis, Probabilistic Latent Semantic Analysis, Latent Dirichlet Allocation and the Gamma-Poisson model. We highlight the differences and similarities between them.

Anomaly detection in recordings from in-vehicle networks
A. Theissler

In the automotive industry, test drives are conducted during the development of new vehicle models or as a part of quality assurance of series-production vehicles. Modern vehicles have 40 to 80 electronic control units interconnected via the so-called in-vehicle network. The communication on this in-vehicle network is recorded during test drives for fault analysis, which results in big data. This paper proposes to use machine learning to support domain experts by sparing them from examining irrelevant data and pointing them instead to the relevant parts of the recordings. The underlying idea is to learn the normal behaviour from the available multivariate time series and then to autonomously detect unexpected deviations and report them as anomalies. The one-class support vector machine "support vector data description" is enhanced to work on multivariate time series. The approach makes it possible to detect unexpected faults without modelling effort, as is shown on recordings from test drives. The proposed methodology is applicable to multivariate time series from other sources, e.g. network traffic or industrial plants.
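As a rough illustration of the idea described in the abstract above (learn normal behaviour from fault-free data, then flag deviations), the following Python sketch fits a fixed-centre hypersphere around feature vectors taken from normal recordings and reports vectors that fall outside it. This is a deliberate simplification and not the method of the paper: support vector data description optimizes the centre and radius of the sphere, whereas this sketch simply uses the mean of the training data and a distance quantile. The function names, the two-feature windows and all values are hypothetical.

```python
import math


def fit_normal_model(normal_windows, quantile=0.95):
    """Learn a centre and radius from feature vectors of fault-free recordings.

    Simplified stand-in for SVDD: the centre is the plain mean of the
    training vectors, not the solution of an optimization problem.
    """
    n = len(normal_windows)
    dims = len(normal_windows[0])
    centre = [sum(w[d] for w in normal_windows) / n for d in range(dims)]
    distances = sorted(math.dist(w, centre) for w in normal_windows)
    # Radius: the training distance at the chosen quantile position.
    index = min(n - 1, int(quantile * n))
    return centre, distances[index]


def is_anomaly(window, centre, radius):
    """A window is reported as anomalous if it lies outside the sphere."""
    return math.dist(window, centre) > radius


# Hypothetical features per time window: (mean speed, mean engine temperature).
normal = [(50.0, 90.0), (52.0, 91.0), (49.0, 89.5), (51.0, 90.5), (50.5, 90.2)]
centre, radius = fit_normal_model(normal, quantile=1.0)

# One window close to the training data, one with an implausible temperature.
flags = [is_anomaly(w, centre, radius) for w in [(50.2, 90.1), (50.0, 120.0)]]
print(flags)
```

With these made-up values, only the second test window lies outside the learned sphere. In practice the choice of features per window and of the quantile would decide how many false alarms the domain experts have to review.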

Generating, repairing, and manipulating aeronautic databases J. M. Vega Industrial databases resulting from experimental campaigns and numerical simulation are of paramount interest in a many engineering fields such as aeronautics, for a variety of tasks, including aeronautic design, certification, control, and operation of flight simulators. These databases contain information on several output variables depending on both the spatio-temporal variables and several parameters, meaning that they are genuinely multidimensional, involving a large/huge number of dimensions (say, in the range, 10-100). They may exhibit a very large size, their generation may involve a very large cost, and they could need to be used in an environment that requires real time operation. In other words, these databases involve two of the three v's (namely, either volume, velocity, or both) that define Big Data. Furthermore, they may exhibit faulty/erroneous/noisy data that need to be repaired. The case of aerodynamic databases resulting from wind tunnel tests and computational fluid dynamics will be used to illustrate both the challenges ahead and some recent developments in the field based on an appropriate combination of deterministic tools such as proper orthogonal decomposition, high order singular value decomposition, fast Fourier transform, and projection of the governing equations onto low dimensional subspaces. Network traffic analysis by means of Misleading Generalized Itemsets D. Apiletti, E. Baralis, L. Cagliero, T. Cerquitelli, S. Chiusano, P. Garza, L. Grimaudo and F. Pulvirenti In the last ten years, with the explosion of the usage of Internet, network traffic analytics and data mining issues have taken primary importance. Generalized itemset mining is an established data mining technique which allows us to discover multiple-level correlations among data equipped with analyst-provided taxonomies. 
In this work, we address the discovery of a specific type of generalized itemsets, named misleading generalized itemsets (MGIs), which can be used to highlight anomalous situations in potentially large datasets. More specifically, MGIs are high-level patterns with a contrasting correlation type with respect to those of many of their descendant patterns according to the input taxonomy. This work proposes a new framework, named MGI-Cloud, which is able to efficiently extract misleading generalized itemsets. The framework is characterized by a distributed architecture and is composed of a set of MapReduce jobs. As a reference case study, MGI-Cloud has been applied to real network datasets, captured in different stages from a backbone link of an Italian ISP. The experiments demonstrate the effectiveness of our approach in a real-life scenario.

Unsupervised Detection of Network Attacks in the dark
P. Owezarski, P. Casas and J. Mazel
The unsupervised detection of network attacks represents an extremely challenging goal. Current methods rely either on very specialized signatures of previously seen attacks, or on labeled traffic datasets for profiling and training, which are expensive and difficult to produce. In this paper we present a completely unsupervised approach to detect attacks, without relying on signatures, labeled traffic, or training. The method uses robust clustering techniques to detect anomalous traffic flows. The structure of the anomaly identified by the clustering algorithms is used to automatically construct specific filtering rules that characterize its nature, providing easy-to-interpret information to the network operator. In addition, these rules are combined to create an anomaly signature, which can be directly exported to standard security devices such as IDSs, IPSs, and/or firewalls. The clustering algorithms are well suited to parallel computation, which makes it possible to perform the unsupervised detection and construction of signatures on-line. We evaluate the performance of this new approach in discovering and building signatures for different network attacks without any previous knowledge, using real traffic traces.

A Survey on feature selection and reduction for network traffic characterization
B. Zhu and A. Mozo
In the last decade, the research community has focused on new classification methods that rely on statistical characteristics of internet traffic, instead of the previously popular port-number-based or payload-based methods, which face ever greater constraints due to unreliable port numbers, the prevalence of encryption, stricter privacy policies and frequent changes of payload format. Unlike previous methods, new approaches that use flow-level or packet-level statistical features of internet traffic are able to avoid the defects mentioned above and, combined with machine learning techniques, have shown promising results. Some previous studies based on statistical characteristics generated large feature sets of internet traffic; however, it is impossible to handle hundreds of features in today's big data scenarios, as doing so only leads to excessive processing time and misleading classification results due to redundant and correlated data. As a consequence, a feature selection procedure is essential in the process of internet traffic characterization. A study of several feature selection techniques is presented, and the techniques are compared and critiqued taking into account the specific requirements of internet characterization.

A Telecom Analytics Framework for Dynamic Quality of Service Management
A. Guadamillas, M.A. López, N. Maravitsas, A. Mozo and F. Pulvirenti
Since the beginning of the Internet, Internet Service Providers (ISPs) have seen the need to give users' traffic different treatments defined by agreements between ISPs and customers.
This procedure, known as Quality of Service Management, has not changed much in recent years (DiffServ and Deep Packet Inspection have been the most widely chosen mechanisms). However, the steady growth of Internet users and services, together with the application of recent Machine Learning techniques, opens up the possibility of going one step further in the smart management of network traffic. In this paper, we first survey current tools and techniques for QoS Management. Then we introduce clustering and classification Machine Learning techniques for traffic characterization and the concept of Quality of Experience. Finally, with all these components, we present a brand new framework that will manage Quality of Service in a smart way in a telecom Big Data based scenario, for both mobile and fixed communications.

Adaptive Quality of Experience (AQoE) control for Telecom Networks
A. Bascuñana, F. Castro, D. Espadas, P. Sánchez and M.A. Monjas
Early detection of quality of experience (QoE) degradation patterns which could end up in congestion situations is currently at the top of the agenda for communication service providers (CSPs), especially those providing mobile services. Mobile operators already have the possibility of classifying users and applying congestion mitigation policies depending on the customer segment they belong to. Classification can be done in different and even highly flexible ways, but no operator is currently giving its users automatic adaptation procedures so that their QoS meets their different expectations, profiles and device usage when QoE degradation scenarios occur. This position paper aims at describing this problem, analyzing, among other issues, the need to have a larger amount of processed information about how and when users consume services, along with a complete profiling of said users, and the need to build a new automated priority group distribution that enables the network to provide the best QoE while optimizing bandwidth usage and other relevant parameters.

IoT and Machine learning
Ajit Jaokar (FutureText & Oxford University)
Ajit Jaokar's work is based on identifying and researching cross-domain technology trends in Telecoms, Mobile and the Internet. Spanning academia and industry, his current research interests include policy research, Big Data, Telecoms, Smart Cities, Big Data Analytics and IoT. Ajit conducts a course at Oxford University on Big Data and Telecoms and also teaches at City Sciences (Technical University of Madrid) on Big Data Algorithms for future Cities / Internet of Things.

SmartSteps and Big Data at Telefonica
Jose-Luis Agúndez (Telefonica S.A.)
In the fast-moving world of Big Data, technology is becoming less the subject of every talk, and the focus is moving to its applications and to how to respond to the concerns about privacy that people are voicing vehemently in the post-Snowden era. There are several factors that should be highlighted to further focus this subject matter. The main one is the need to draw a clear dividing line between the "espionage stories" that fill our books and screens on the one hand and the products and services that companies can develop under the supervision of data protection regulatory bodies on the other. Another subject requiring attention is the handling of data in online services that are beginning to attract the interest of regulators, compared with services that have been highly regulated for some time, such as banking and telecommunications. It is also undoubtedly interesting to speak about the use of anonymized data aggregated at the statistical level, as would be the case with lists of registered voters at election time.
Exploiting this background should help us to propose a series of requirements that a setting of transparency should meet to respond to each of the aforementioned concerns and conditions.

Capturing Value from Big Financial Data
Daniel Villatoro (BBVA Data&Analytics)
Financial data is an underutilized asset, one that, focused properly, can bring new opportunities to financial institutions and third parties alike. It is critical to understand that the creation of new opportunities, or the creation of new business value, is key to the successful adoption of the current and coming crops of big data solutions.

Businesses based on customers' advanced analyses
Mónica León Santamaría (INDRA)
Growing digitalization in society implies new challenges, which are met with new capacities in information management and customer analytics.

Big Data, bigger problems for data protection?
Celia Fernandez Aller (Universidad Politécnica de Madrid)
Big data refers to the exponential growth both in the availability and in the automated use of information: it refers to very big digital datasets held by corporations, governments and other large organizations, which are then extensively analyzed using computer algorithms. Big data can be used to identify more general trends and correlations, but it can also be processed in order to directly affect individuals. The expectation from big data is that it may ultimately lead to better and more informed decisions. There are numerous applications of big data in various sectors, including healthcare, mobile communications, smart grid, traffic management, fraud detection, marketing and retail. With all its potential for innovation, big data may also pose significant risks for the protection of personal data and the right to privacy, as it increases the risk that people lose control of their own data. In particular, big data raises concerns about: a) the sheer scale of data collection, tracking and profiling; b) the security of data; c) transparency, which implies sufficient information given to individuals; d) inaccuracy, discrimination, exclusion and economic imbalance; e) increased possibilities of government surveillance. We will analyze how to apply European data protection principles to Big Data: first, the specified, explicit and legitimate purposes; then, the general compatibility assessment and the specific provisions on further processing for historical, statistical or scientific purposes, including appropriate safeguards that may help data controllers meet the compatibility test. The legal requirements are well established in Directive 95/46/EC and will be modified by the Proposed Data Protection Regulation.

The workshop proceedings have been generated in both digital and physical formats, and Annex A.2 shows how to access and download them. Some presentations from the industrial track are also available on the workshop website (in the Proceedings section).

7. Workshop Web

7.1 Workshop website as part of the ONTIC project portal
A website was built and put online to meet the needs of participants, attendees, and everyone interested in the workshop. This website is integrated in the ONTIC project portal and can be accessed through the "Events" menu, as seen in Figure 2.

Figure 2: Access from ONTIC portal to BigDAP web

The BigDAP website offers an extensive compilation of all information on the workshop. Like the ONTIC portal, the workshop website contains links to the communication channels for dissemination and feedback. The website, shown in Figure 3, can also be accessed directly through the following URL: http://ict-ontic.eu/bigdap14/

Figure 3: BigDAP web page

The tools used for creating this website are the same ones that were used to build the ONTIC project portal.

7.2 Web structure
The workshop webpage consists of the main menu and the right panel (channel panel). When the BigDAP website is first accessed, this menu structure and its links are visible, as seen in Figure 4.

Figure 4: Menu web BigDAP

The main menu contains eleven basic branches:

1. Home
In the home branch, a section explaining the BigDAP workshop can be found. This information helps the visitor understand the importance and the impact of the workshop. As in the main ONTIC portal, a panel on the right shows relevant news and the BigDAP Twitter feed.

2. Proceedings
This section includes links to download the BIGDAP-14 Proceedings Volume and several presentations from the Industrial Track. This section was enabled at the end of the workshop.

3. Scope
This section contains the goals, the topics and the philosophy of the workshop.

4. Committees
This section shows the composition of the committees, together with the home institution of each member. Members with the roles of chair and co-chair are shown first. This information corresponds to the contents of section 8 of this document.

5. Important Dates
This section shows the paper submission and participant registration deadlines. This information corresponds to the contents of section 6.1 Definition of activities related to the workshop program.

6. Submission
All information pertaining to the paper submission and review process is shown here, as well as the stipulated paper structure and format. A link to the EasyChair website allows applicants to send their papers. Details related to the publication of accepted papers in the proceedings are clearly explained. This information corresponds to the contents of section 6.1.3 Submission process of this document.

7. Registration
This section contains details on the registration procedure and dates. Specifics on the registration process for participants can be found in section 9 Workshop development. This section was disabled at the end of the workshop.

8. Accommodation
This section contains information on hotels near the workshop venue (ETSISI).

9. Venue
This section explains how to reach the ETSISI by different means of transportation.

10. About
This section mentions the relation between BigDAP and its host and sponsor, FP7 ONTIC. Contact information and the geographical location of the workshop can also be found in this section.

11. Program
The workshop schedule is available in this section. This information was published on August 20, 2014. Each entry of the schedule table is a hyperlink to the abstract of the corresponding talk.

As mentioned before, the panel on the right shows BigDAP news, as well as links to the ONTIC and BigDAP social platform profiles. This helps increase awareness and dissemination of the project's results.

8. Dissemination and advertising of the Workshop
To promote the dissemination of the workshop, several different channels were used:
* Via email, using international email distribution lists from academic fields related to Computer Science in Europe.
* Via email, sent by the Programme and Organization Committee members to personal contacts in academic and industrial fields.
* Using web pages and other internal dissemination means of each of the consortium partners in their respective universities.
* Via the ONTIC Project webpages, using the Event and News sections.
* Via Twitter, using the official ONTIC Project Twitter account.

8.1 Email distribution lists
Several Calls for Papers were sent using international email distribution lists and were personally sent to institutions, organizations and contacts by each of the members of the consortium and both of the workshop committees, as stipulated in section 6.1.1 Call for papers of this document. The template used for this Call for Papers can be found in Annex A.1 of this document. Some of the email distribution lists used to disseminate BIGDAP are: PROLE, DISTJISBD, JCSD, TYPES-ANNOUNCE, CFP, BIG-DATA and MMB.

8.2 Visibility at the Universidad Politécnica de Madrid website
The event was publicized before, during and after it took place on the website of the ETSISI at the Universidad Politécnica de Madrid (URL: http://www.etsisi.upm.es/). Results derived from the workshop were published on the ETSISI website (http://www.etsisi.upm.es/noticia_destacada/celebraci%c3%b3n_del_1st_international_workshop_big_data_applications_and_principles, see Figure 5) as relevant news, highlighting how the development and visibility of the event benefit the University, with emphasis on the association with the ONTIC Project and the funding by the European Commission under the SEVENTH FRAMEWORK PROGRAMME.

Figure 5: ETSISI web divulgation of BigDAP 2014

619633 ONTIC. Deliverable 6.8

8.3 Workshop promotion on audiovisual media
Additionally, a poster to promote the event was designed by members of the Organizing Committee. This poster was distributed via email for dissemination by consortium members. Within the ETSISI, the poster was published on the website and shown on screens strategically placed throughout the school at Campus Sur. The poster can be seen in Figure 6.

Figure 6: BigDAP 14 poster

9. Workshop development

9.1 Registration Process
Registration of attendees to the workshop was previously explained in section 7.2. Specifically, registration was completed by filling in a Registration Form (downloadable from the workshop webpage at http://ict-ontic.eu/bigdap14/forms/bigdap14_registration_form.pdf). The Registration Form was then sent to BigDAP@ict-ontic.eu with the subject 'BigDAP Registration' and a bank receipt attached. For external attendees of the workshop, the payment was defined as a sum that covered the expenses of each attendee (lunch, breakfast and coffee breaks) as well as the additional material handed to them during the event (promotional memory stick with the event logo, proceedings copy, folders, etc.). To ensure that the capacity of the venue was not exceeded, an early payment option was offered (before August 20th) and, depending on the number of attendees, a late registration period would be opened until 3 days before the start of the event, or until no more places were available, as shown in the next table.

Registration type                                        Paid until          Fee
Full early registration                                  August 20, 2014     130 Euros
Full late registration                                   September 8, 2014   150 Euros
Student late registration (without grant application)    September 8, 2014   50 Euros

Table 4: Fees and types of registration

9.2 Receipt of grant application for students
As part of the registration process, postgraduate and undergraduate students from universities that are part of the ONTIC Project consortium could apply for grants to attend the event. The grant applications were publicized through the same channels described in the previous section of this document, referring applicants to the section of the BigDAP website on the registration process. The requirements to be eligible for a grant application were:
* Submission of an official document certifying the student status of the applicant.
* Submission of the Registration Form filled in with the applicant's information. Available at: http://ict-ontic.eu/bigdap14/forms/bigdap14_registration_form.pdf
* Submission of the Grant Application form filled in, including a short CV and a short description of the motivation to attend the event. Available at: http://ict-ontic.eu/bigdap14/forms/bigdap14_grant_application_form.pdf

Students that applied for a grant had all the costs of their attendance at the workshop covered.

9.3 Attendees
Attendees of the workshop could be sorted into one of the following groups:
* Students with grants: those that applied for a grant through the appropriate channels.
* Partners: members of the project's consortium.
* Consortium invited: people related to consortium members, personally invited by them to participate in the event.
* Speakers: authors of the accepted papers (submitted to the EasyChair platform) and invited speakers.
* Rest of attendees: people without any connection to the event organization or to the universities and organizations that are part of the consortium, who were interested in the event and registered on their own account. This group included students that did not apply for a grant, as well as professors and members of companies interested in attending the event.

Type of attendees                     Number of attendees
Students with grants                  26
Partners                              9
Consortium invited                    8
Speakers                              17
Rest of attendees                     6
Organization committee & chairman     9
Total of attendees                    76

Table 5: Attendees statistics

Table 5 shows the total number of attendees broken down into the different groups.
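As a quick arithmetic check on Table 5, the following Python sketch sums the per-group counts, compares the result with the stated total, and derives each group's share. The figures are copied from the table; the dictionary layout and variable names are illustrative only. Note that the six group rows as printed add up to 75, one fewer than the stated total of 76; the report does not explain the difference, so the sketch simply reports it.

```python
# Figures copied from Table 5; the data layout is illustrative.
attendees_by_group = {
    "Students with grants": 26,
    "Partners": 9,
    "Consortium invited": 8,
    "Speakers": 17,
    "Rest of attendees": 6,
    "Organization committee & chairman": 9,
}
stated_total = 76

# Sum the rows and compare with the total printed in the table.
row_sum = sum(attendees_by_group.values())
difference = stated_total - row_sum

# Share of each group, as a percentage of the summed rows.
shares = {
    group: round(100 * count / row_sum, 1)
    for group, count in attendees_by_group.items()
}

print(f"sum of rows: {row_sum}, stated total: {stated_total}, difference: {difference}")
for group, share in shares.items():
    print(f"{group}: {share}%")
```

On these figures, students with grants are the largest group, at roughly a third of the attendees.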

10. Results

Figure 7: Event pictures

The results of the first BigDAP Workshop are:
* Presentation and dissemination of the state of the art in different areas of research and development in the field of Big Data, both in the academic and the industrial world.
* Networking: new contacts have been established with people connected to the industrial, academic and scientific fields, achieving an exchange of ideas and knowledge.
* Connection of the new contacts in the industrial and scientific fields to the ONTIC Project website and its different dissemination channels in social media.
* Encouragement of undergraduate and postgraduate students, especially PhD students, to get involved in research fields related to Big Data, Big Analytics and Computer Science.
* Motivation of postgraduate and undergraduate students to participate in the event, thanks to the grants given, funded by the European Commission.
* Establishment of an organization and management structure that will serve as a foundation for future editions of the event.
* Creation of trademarks of the event: logo, posters, and a website with the name and logo that allow easy identification of the event and the ONTIC Project, which is an important part of the dissemination of the event as an activity of the ONTIC Project.
* Creation of an integration space for the industrial and scientific fields, with the intention of strengthening the relations and the sharing of knowledge between both sectors.
* Production of a proceedings volume, in both digital and physical format, with an associated ISBN. In this context, contact with Springer Verlag has been established for future editions of the event.

11. Future Plans
During the organization of this first edition of the BigDAP Workshop some interesting ideas emerged, which will be considered for further editions of the event. Some of these ideas are:
* Future editions of BigDAP will include one track on students' PhD proposals.
* Release future proceedings volumes with the assistance of a publisher that will help index the works in the principal scientific lists: dblp, Google Scholar, etc. In this regard, initial contact has been established with Springer Verlag for future workshop editions.
* Establish new contacts and obtain new distribution lists to give more visibility to the event.
* Create an attendee contact list, to invite them to participate in future editions of the event.

Annex A

A.1 Caller structure

------------------------------------------------------------
***** SUBMISSION DEADLINE EXTENDED: July 31 *****
------------------------------------------------------------

------------------------------------------------------------
CALL FOR PAPERS
1st International Workshop on Big Data Applications and Principles
BigDAP 2014
Madrid, Spain, September 11-12, 2014
http://ict-ontic.eu/bigdap14/
------------------------------------------------------------

BigDAP 2014 is a workshop aimed at promoting and displaying excellent research and innovation on Big Data. BigDAP 2014 encourages high-quality research in all branches of Big Data application. Their broad scope provides an opportunity to bring together researchers and industrial companies motivated by the exchange of theoretical, practical and experimental approaches in implemented real use cases.

BigDAP 2014 is hosted and sponsored by FP7 ONTIC project (Online Network TraffIc Characterization, http://ict-ontic.eu) that is funded by European Commission. Therefore, the application of scalable Big Data analytics to network traffic characterization will be a key topic in this workshop.

Significant room will be reserved to PhD students in BigDAP 2014, to allow them to share ideas and know-how related to Big Data topics.

Venue:
---------
BigDAP 2014 will be held in the School of Computer Systems Engineering (E.T.S. Ingeniería de Sistemas Informáticos, www.etsisi.upm.es), Technical University of Madrid (Universidad Politecnica de Madrid), Madrid, Spain, on September 11-12, 2014.

Topics:
---------
Topics of either theoretical or applied interest include, but are not limited to:
* Techniques, models and algorithms for Big data
* Scalable Data Mining and Machine learning techniques and mechanisms
* Big Data frameworks and architectures
* NoSQL, NewSQL and Graph databases
* Verification, Validation and Testing Big Data applications