Web and Big Data at LIG. Marie-Christine Rousset (Professor at UJF, scientific delegate of LIG)



Data and Knowledge Processing at Large Scale
Officers: Massih-Reza Amini, Jean-Pierre Chevallet
Teams: AMA, EXMO, GETALP, HADAS, MRIM, SLIDE, STEAMER
Scientific focus: data mining, natural language processing, machine learning, DBMS, GIS, information retrieval, social networks, Semantic Web, Linked Data

Distributed Systems, Parallel Computing, and Networks
Officers: Vivien Quéma, Arnaud Legrand
Teams: DRAKKAR, MESCAL, MOAIS, NANOSIM, ERODS
Scientific focus: HPC, cloud computing, Future Internet, multi-core programming, parallel and embedded systems

LIG is involved in many projects and infrastructures (Clouds/HPC) for Big Data analytics

European projects
  FP7 ICT Exascale Mont-Blanc 1 (2011-2014)
  FP7 ICT Exascale Mont-Blanc 2 (2013-2016)
  FP7 IRSES HPC GA (2011-2014)
  FP7 BioASQ (2012-2014): large-scale categorization and question answering for the biomedical domain

National projects
  FUI Minalogic SoCTrace (2011-2015): analysis of execution traces produced by multi-core embedded applications
  ANR Clouds@Home (2009-2013)
  ANR SONGS (2011-2015)
  FSN OpenCloudware (2012-2014)
  PIA DATALYSE (2013-2016): intelligent warehouses for heterogeneous big data
  ANR Class-Y (2011-2015): classification in large-scale taxonomies, with application to taxonomies such as MeSH
  ANR Qualinca: methods and algorithms for quality and interoperability of large documentary catalogs
  ANR PAGODA: practical algorithms for ontology-based data access

MASTODONS projects
  PROSPECTOM: interactive study of proteomes via statistical learning and data aggregation methods
  ARESOS: machine learning, data mining and information access for social network analysis
  GARGANTUA: theoretical aspects of machine learning and data mining for big data

Infrastructures
  Meso-centre CIMENT (HPC platform in Grenoble)
  Héméra and Grid'5000 projects

DATALYSE (PIA: Cloud and Big Data call)
Goal: deliver a collection of efficient data processing tools, referred to as Datalysers, to prepare, transform, extract value from and visualize Big Data
Joint work between research and industry
  Academics: LIG (HADAS, ERODS, TyRex), INRIA Saclay, LIFL, LIRMM
  Industry: Eolas, Business et Decision (B&D), STIME Mousquetaires
Timeline: started in May 2013 for a period of 42 months
  Deliverable 1: Big Data preparation datalysers
  Deliverable 2: Big Data transformation datalysers
  Deliverable 3: Big Data visualization datalysers
Datasets and platforms: real datasets ranging from User Big Data (UBD) to Monitoring Big Data (MBD)
Website: http://www.datalyse.fr

DATALYSE Use Cases

Linked/Open Data
  Provide access to clean and enriched datasets on museums in Grenoble
  Datasets: UBD
  Application: visualization layer to improve the user experience in museums

Traffic Analysis
  Interactive data center traffic statistics for different ISPs, hosted applications, geographic regions and time periods
  Datasets: MBD
  Application: traffic anomaly detection

Digital Marketing
  Mining customer traffic on hosted websites
  Datasets: UBD
  Application: optimize the conversion rate by monitoring customer traffic

Retail
  Determining what makes customers leave the store
  Datasets: UBD
  Application: help better organize promotional offers for recurring customers

Datalyse architecture

Semantic Web and Linked Open Data
More than 31 billion RDF triples
Data linkage and enrichment (geo-localized, personalized)
Ontology-based information access and integration
Semantic search
Data disambiguation

Semantic Web technologies are now mature enough to add value to data and to support innovative applications
Example: the Living Book of Anatomy (funded by PERSYVAL-lab)
  Description of anatomic objects, constraints, functions and 3D models
    "3Dmodel1 describes the Sartorius, which is a Muscle that participates in the Flexion of the Knee"
  Reasoning and querying capabilities
    "Which 3D objects refer to muscles that participate in the Flexion of the Knee?"
  Evolutive and efficient tool for patient-specific 3D anatomic visualization and simulation
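The description and the query quoted above can be sketched as RDF-style triples and a pattern match. This is only an illustration in plain Python, not the Living Book of Anatomy implementation: the predicate names (`describes`, `type`, `participates_in`), the identifiers, and the second model (`3Dmodel2`, `Femur`) are assumptions made for the example.

```python
# Hypothetical triples mirroring the slide's example sentence.
# Predicate and entity names are illustrative, not taken from the real ontology.
TRIPLES = {
    ("3Dmodel1", "describes", "Sartorius"),
    ("Sartorius", "type", "Muscle"),
    ("Sartorius", "participates_in", "FlexionOfKnee"),
    ("3Dmodel2", "describes", "Femur"),   # assumed extra data for contrast
    ("Femur", "type", "Bone"),
}


def models_for_function(triples, function):
    """Return the 3D objects that describe a muscle participating in `function`."""
    muscles = {s for (s, p, o) in triples if p == "type" and o == "Muscle"}
    active = {s for (s, p, o) in triples if p == "participates_in" and o == function}
    wanted = muscles & active
    return sorted(s for (s, p, o) in triples if p == "describes" and o in wanted)


print(models_for_function(TRIPLES, "FlexionOfKnee"))  # ['3Dmodel1']
```

In a real deployment the same question would more likely be written as a SPARQL query against an RDF store; the set comprehensions here only stand in for the triple patterns such a query would contain.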

My Corporis Fabrica ontology
  75,000 classes, 11 rules, 1M RDF triples
  Description of anatomic objects, constraints, functions and 3D aspects
    "3Dmodel1 describes the Sartorius, which is a Muscle that participates in the Flexion of the Knee"
  Reasoning and declarative querying capabilities on knowledge
    "Which 3D objects refer to muscles that participate in the Flexion of the Knee?"
  Evolutive and efficient tool for knowledge-driven 3D anatomy
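The slide lists rules alongside classes and triples but does not spell them out. As a rough illustration of what rule-based reasoning over triples can look like, the sketch below forward-chains a single, assumed subclass rule until no new fact appears. The rule, the predicate names and the class `ThighMuscle` are hypothetical and are not claimed to be among the 11 rules of My Corporis Fabrica.

```python
def infer_types(triples):
    """Forward-chain one assumed rule until a fixpoint is reached:
    (x type C) and (C subClassOf D)  =>  (x type D)
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        derived = {
            (x, "type", d)
            for (x, p1, c) in facts if p1 == "type"
            for (c2, p2, d) in facts if p2 == "subClassOf" and c2 == c
        }
        if not derived <= facts:   # at least one new fact was derived
            facts |= derived
            changed = True
    return facts


facts = infer_types({
    ("Sartorius", "type", "ThighMuscle"),          # hypothetical class name
    ("ThighMuscle", "subClassOf", "Muscle"),
    ("Muscle", "subClassOf", "AnatomicalEntity"),  # hypothetical class name
})
print(("Sartorius", "type", "Muscle") in facts)  # True
```

With the inferred fact in place, a query over muscles also returns entities that were only declared as instances of a more specific class.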

Conclusion
LIG has broad, cross-cutting expertise in the Web and Big Data
  Ranging from HPC and Cloud infrastructures, to large-scale data and knowledge management systems, to information visualization for human decision support (IIHM team at LIG)
  Ranging from the fundamental aspects of data science to systems and applied aspects
LIG is involved in many national and European collaborative projects on these topics