Knowledge discovery from biological Big Data : scalability issues
1 Knowledge discovery from biological Big Data : scalability issues
Marie-Dominique Devignes, Malika Smaïl, Emmanuel Bresso, Adrien Coulet, Chedy Raïssi, Amedeo Napoli
Université de Lorraine, LORIA laboratory and INRIA Nancy Grand-Est, Orpailleur team, Nancy, France
2 From (big) data to knowledge
[Diagram labels: KDD, Raw Data, Information, K, Problem solving, making decision]
KDD : Knowledge Discovery from Databases, an iterative and interactive process
6/12/2013 Big Data Challenges and Opportunities
3 A «big data» story in the life sciences
Presented by Russ Altman (PharmGKB) on Youtube: EngX webinar at Stanford Engineering School, nov12
1. FDA Adverse Event Reporting System (FAERS)
2. Data subset related to eight classes of side effects
3. Supervised statistical machine learning
4. Correlation models : adverse reaction due to Drug-Drug Interactions (DDI)
Adverse event interpretation in electronic medical records
[Diagram labels: Data, Information, K]
Tatonetti et al., A novel signal detection algorithm for identifying hidden drug-drug interactions in adverse event reports. JAMIA, 19:79-85, 2012
4 The KDD bottlenecks in the life sciences
Data and data sources
  Noisy, complex, heterogeneous, distributed, dynamic, etc.
  Need for «knowledge/model-driven» data integration
Data selection
  Example and feature selection for machine learning
  Need for guidelines
Parameters of data mining programs
  Experimental approach
  Need for efficient execution platforms
Pattern evaluation and interpretation
  Big data mining can yield a big volume of patterns!
  How to evaluate the novelty, significance and consistency of a pattern at large scale?
5 Objectives of the talk
1. How do big data and biological databases cooperate?
2. How can bio-ontologies help in knowledge discovery?
3. Big data opportunities for the knowledge discovery process
6 Biological databases are Big Data
More than 1500 biological databases today
  Curated data (not always)
  Complex schema
  Time-consuming update and integration
Uniprot - Stats nov 2013 :
  SwissProt > 542 KiloSeq for 192 MegaAA
  TrEMBL > 48 MegaSeq for 15 GigaAA
Biological data has been «big data» for years!
7 Semantic web* as emerging biological Big Data
Linked Open Data (LOD)
  Interconnected data
  Freely accessible on the web
RDF : Resource Description Framework
  {Subject, Property, Object}
  URI (Uniform Resource Identifier)
Bio2RDF project
  1 Tera triple graph in July 2013
[Graph example, node labels: KeggPathway hsa:nnn, KeggGene hsa:ggg, Uniprot sp:ppp, Interpro ipr:ddd; edge labels: Has_gene, Has_domain, See_Also, Xref]
*"Semantic Web is a group of technologies to allow computers to autonomously process information resources without human intervention by annotating the meaning or 'semantics' to them" (coined by Tim Berners-Lee in 1998).
8 From databases to RDF triples
«RDFization» of database contents
  Database fact -> RDF triple
  Database -> Graph
e.g. a protein P:pppp containing a domain D:ddd
  Uniprot sp:ppp Has_domain Interpro ipr:ddd
«EBI Sparql end-point»
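The «RDFization» step on this slide can be illustrated with a minimal Python sketch. It assumes a database fact arrives as a dictionary with hypothetical keys `protein` and `domain`; the `http://example.org/` base URI and the helper names `rdfize` and `to_line` are placeholders for illustration, not the identifiers used by Bio2RDF or the EBI endpoint.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One {Subject, Property, Object} statement."""
    subject: str
    prop: str
    obj: str


def rdfize(fact: dict) -> Triple:
    # Map one protein/domain database fact to a single triple.
    # The "sp:" and "ipr:" prefixes follow the slide's example.
    return Triple(f"sp:{fact['protein']}", "Has_domain", f"ipr:{fact['domain']}")


def to_line(t: Triple, base: str = "http://example.org/") -> str:
    # Serialise in an N-Triples-like form; the base URI is a placeholder.
    return f"<{base}{t.subject}> <{base}{t.prop}> <{base}{t.obj}> ."


row = {"protein": "ppp", "domain": "ddd"}  # hypothetical database row
triple = rdfize(row)
print(to_line(triple))
```

Applying `rdfize` to every row of a table yields the graph view of the database: each row becomes one edge between two identified resources.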
9 Cooperation between LOD and databases
Classical databases can provide reliable curated information to complement and enrich information extracted from LOD
Project EXPLOD-BioMed (Adrien Coulet)
  Exploring LOD for the purpose of mining biomedical data
  Collect data about the genes responsible for intellectual disability
  Use Bio2RDF or EBI/RDF SPARQL endpoints
  Incomplete «RDFization» -> complete the datasets by querying classical databases + RDF representation of results
  Storing retrieved RDF triples into a triple store
  Or back to a relational DB (!) for easy design of KDD workflows using Knowledge Discovery Environments (such as KNIME)
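The «back to a relational DB» option can be sketched with Python's standard `sqlite3` module. The three-column table layout, the sample triples and the helper `subjects_with` are assumptions made for illustration; they do not describe the actual EXPLOD-BioMed schema.

```python
import sqlite3

# Hypothetical triples, as they might come back from a SPARQL endpoint.
triples = [
    ("gene:g1", "Has_domain", "ipr:d1"),
    ("gene:g1", "Xref", "sp:p1"),
    ("gene:g2", "Has_domain", "ipr:d1"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triple (subject TEXT, property TEXT, object TEXT)")
conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", triples)


def subjects_with(conn, prop, obj):
    # Return all subjects linked to `obj` through `prop`, sorted by name.
    rows = conn.execute(
        "SELECT subject FROM triple WHERE property = ? AND object = ? ORDER BY subject",
        (prop, obj),
    )
    return [r[0] for r in rows]


genes = subjects_with(conn, "Has_domain", "ipr:d1")
print(genes)
```

Once the triples sit in an ordinary table, a workflow tool that reads relational data can consume them without any RDF-specific connector, which is presumably the convenience the slide points to.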
10 Flexibility versus Semantics : research opportunities
Moving from relational DB to NoSQL storage systems
  Schema-less data -> lack of documentation, loss of semantics
  New management systems to be invented
Analytic tools need to be adapted to such systems
  Mahout, MOA, PEGASUS
Fayyad UM (2012) Big data everywhere and No SQL in sight. SIGKDD explorations, 14: i-ii
Fan W and Bifet A (2012) Mining big data: current status and forecast to the future. SIGKDD explorations, 14: 1-5
11 Objectives of the talk
1. How do big data and biological databases cooperate?
2. How can bio-ontologies help in knowledge discovery?
3. Big data opportunities for the knowledge discovery process
12 KDDK : Knowledge Discovery guided by Domain Knowledge in the Orpailleur team
1. Data extraction and formatting
2. Data Mining
3. Result interpretation
[Diagram labels: DB1, DB2, DB3, etc.; Data integration; Data; Data mining; Domain Knowledge; Knowledge Base (KB)]
Coulet et al. Ontology-based knowledge discovery in pharmacogenomics. Adv Exp Med Biol. 2011;696:
13 Bio-ontologies, an asset in the life sciences
Ontologies = knowledge representation
  From hierarchical vocabularies, e.g. MeSH, MedDRA, GO, SNOMED, ICD
  To logical representation of concepts and relationships, e.g. SIO (Semanticscience Integrated Ontology), UMLS Semantic Types, SOPharm
Usages (semantic web technologies)
  Model layer of knowledge bases
  Semantic enrichment, e.g. Onto-Tools, IntelliGO
  Cross-resource data retrieval, e.g. NCBO Resource Index
14 National Center for Biomedical Ontology : NCBO BioPortal
15 Bio-ontologies and LOD exploration
366 bio-ontologies at the NCBO BioPortal: 6 Mega concepts
39 biological resources: UniProt, GO, ArrayExpress, GEO, PharmGKB, etc.
  5 Mega records
  24.8 Giga annotations
(Jonquet C et al. (2011) NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources. Web Semantics 9: )
Statistics updated November
16 Exploring resources with the Resource Index
17 Bio-ontologies and dimension reduction
Big data often mean high-dimensional data
Statistical methods for feature selection
  Many possible methods
Clustering similar features using a terminology and a semantic similarity measure
  E.g. semantic clustering of 1288 MedDRA adverse effect terms -> 112 term clusters
  Enables execution of symbolic data mining methods such as frequent itemset search
Bresso et al. (2013) Integrative relational Machine-Learning Approach for Understanding Drug Side-Effect Profiles. BMC Bioinformatics, 14(1):207.
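The idea of grouping similar terms with a terminology and a semantic similarity measure can be sketched in Python as follows. Everything here is a toy assumption: the four-term hierarchy `PARENT` stands in for MedDRA, the Jaccard overlap of ancestor sets stands in for a real semantic similarity measure, and the greedy grouping is not the clustering procedure of Bresso et al.

```python
# Toy terminology: child -> parent (hypothetical, not MedDRA itself).
PARENT = {
    "nausea": "gastrointestinal",
    "vomiting": "gastrointestinal",
    "rash": "skin",
    "pruritus": "skin",
    "gastrointestinal": "root",
    "skin": "root",
}


def ancestors(term):
    # The term itself plus every ancestor up to the root.
    out = {term}
    while term in PARENT:
        term = PARENT[term]
        out.add(term)
    return out


def similarity(a, b):
    # Jaccard overlap of the two ancestor sets.
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def cluster(terms, threshold=0.5):
    # Greedy grouping: a term joins the first cluster whose first member is similar enough.
    clusters = []
    for t in terms:
        for c in clusters:
            if similarity(t, c[0]) >= threshold:
                c.append(t)
                break
        else:
            clusters.append([t])
    return clusters


clusters = cluster(["nausea", "vomiting", "rash", "pruritus"])
print(clusters)
```

Replacing each term by the identifier of its cluster shrinks the feature space, which is what makes symbolic methods such as frequent itemset search tractable on the reduced data.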
18 Objectives of the talk
1. How do big data and biological databases cooperate?
2. How can bio-ontologies help in knowledge discovery?
3. Big data opportunities for the knowledge discovery process
19 Big data as a reservoir of data for validating hypotheses and models
Huge data sets become available for mining
  "The amount of effort required to warehouse data often means that valuable data sources in organizations are never mined. This is where Hadoop can make a big difference" (Edd Dumbill, Big Data Now, 2012)
  Adverse events -> grouping medical records from different hospitals is useful to enlarge the dataset
Data mining often generates more than one model, sometimes a huge number of patterns
  Training set requires integrated curated data
  Test set can be extracted from big data -> statistical evaluation
20 The critical «Vs» of Big Data in the Life Sciences
Variety and variability
  New data types provided by high-throughput technologies (OMICS data but also images from microscopy devices)
Value
  FAERS and drug-drug interaction -> better control of drug treatments
  Individual genomes -> personalized medicine
Veracity
  Multiple source integration means detecting and managing possible inconsistencies
  Quality and provenance metadata in the LOD
  Bio2RDF uses Dublin Core metadata triples and calculates 9 metrics for each dataset
  Popularity ranking, cross-reference degree
21 New paradigms for knowledge discovery
Cooperation between symbolic and statistical methods
  Statistical feature selection before symbolic data mining
  Automatic filtering and/or ranking of patterns using statistical significance measurements before expert interpretation
Adaptive learning systems
  Label propagation on big data objects
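Automatic filtering and ranking of patterns by statistical significance could look like the Python sketch below. It assumes each pattern is an association between two conditions A and B counted over `n_total` records, and it uses a one-sided hypergeometric tail probability as the significance measure; the pattern names, the counts and the 0.01 cut-off are invented for illustration, and the slide does not say which measure the authors use.

```python
from math import comb


def hypergeom_pvalue(n_total, n_a, n_b, n_ab):
    # Probability of observing at least n_ab co-occurrences of A and B
    # if A (n_a records) and B (n_b records) were independent.
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_ab, upper + 1)
    )
    return tail / comb(n_total, n_b)


# Hypothetical mined patterns: (name, n_a, n_b, n_ab) over 100 records.
N = 100
patterns = [
    ("pattern_2", 50, 50, 26),  # overlap close to what chance predicts
    ("pattern_1", 10, 10, 8),   # overlap far above what chance predicts
]

scored = [(name, hypergeom_pvalue(N, a, b, ab)) for name, a, b, ab in patterns]
ranked = sorted(scored, key=lambda item: item[1])     # most significant first
kept = [name for name, p in ranked if p < 0.01]       # filter before expert review
print(ranked[0][0], kept)
```

With thousands of patterns, a cut-off applied to raw p-values would need a multiple-testing correction; the sketch leaves that out to stay short.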
22 Other projects in the Orpailleur team
Research projects
  Parallelization of CORON tools, a suite of tools for symbolic data mining and formal concept analysis
  Text mining (ANR Hybride)
    Collaboration with Orphanet
  Graph mining for chemical reactions
    Pennerath F, Niel G, Vismara P., Jauffret P., Laurenço C., Napoli A. (2010) "A graph-mining algorithm for the evaluation of bond formability". Journal of Chemical Information and Modeling, 50:
  Spatio-temporal mining of agronomical data
    Mari JF, Lazrak E-G, Benoît M (2013) Time space stochastic modelling of agricultural landscapes for environmental issues. Environmental Modelling and Software 46:
Education : TELECOM Nancy
  Training engineers as «Data Scientists», Masters level
  Ingénierie et Applications des Masses de Données (IAMD)
23 Conclusion
LOD and biological databases can cooperate in the KDD process
Bio-ontologies are a major asset in the Life Sciences
  For data exploration
  For dimension reduction
Semantic web technology scales up at RDF level
  But not yet at the OWL and reasoning level
HPC computing and programs can process big data
  But Knowledge Discovery remains a human-guided process
24 References
Big Data Now. O'Reilly Media Inc., 1st edition, October 2012 (123 p.)
Bresso E, Grisoni R, Marchetti G, Karaboga AS, Souchet M, Smaïl-Tabbone M (2013) Integrative relational Machine-Learning Approach for Understanding Drug Side-Effect Profiles. BMC Bioinformatics. 14:207.
Callahan A, Cruz-Toledo J, Dumontier M (2013) Ontology-Based Querying with Bio2RDF's Linked Open Data. J Biomed Semantics. 15:4
Coakley MF, Leerkes MR, Barnett J, Gabrielian AE, Noble K, Weber MN and Huyen Y. Unlocking the power of big data at the NIH (Meeting Report). Big Data, September
Coulet A, Smaïl-Tabbone M, Napoli A, Devignes MD (2011) Ontology-based knowledge discovery in pharmacogenomics. Adv Exp Med Biol. 696:
Fan W and Bifet A (2012) Mining big data: current status and forecast to the future. SIGKDD explorations, 14:1-5
Fayyad U (2012) Big data everywhere and No SQL in sight. SIGKDD explorations, 14: i-ii
Higdon R, Haynes W, Stanberry L, Stewart E, Yandl G, Howard C, Broomall W, Kolker N and Kolker E (2013) Unraveling the complexities of life sciences data. Big Data, March
Hoehndorf R, Dumontier M and Gkoutos G (2012) Evaluation of research in biomedical ontologies. Briefings in Bioinformatics. Sept 8, 2012
Jonquet C, Lependu P, Falconer S, Coulet A, Noy NF, Musen MA, Shah NH (2011) NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources. Web Semantics 9:
Tatonetti NP, Fernald GH, Altman RB (2012) A novel signal detection algorithm for identifying hidden drug-drug interactions in adverse event reports. J Am Med Inform Assoc. 19:
25 Thank you for your attention!
