The Advantages of Using Visual Interfaces


Knowledge Extraction and Integration using Automatic and Visual Methods

Vedran Sabol, Roman Kern, Barbara Kump, Viktoria Pammer, Michael Granitzer
vsabol rkern bkump vpammer
Know-Center, Inffeldgasse 21a, 8010 Graz, Austria

Introduction

Techniques for efficiently accessing, analyzing and presenting large amounts of dynamic, heterogeneous data, with the goal of acquiring new knowledge and facts, play an increasingly important role in application domains such as media, patent databases, scientific publication repositories and medical databases. We identify two areas of research which, we believe, can provide significant benefits for fact discovery and knowledge acquisition from large amounts of data, particularly when applied together:

1. There is large potential in combining the knowledge present in structured, semantic data, such as Linked Open Data (DBpedia, GeoData, Friend of a Friend, etc.), with unstructured information, such as human-readable content. Integration of structured and unstructured data will facilitate the development of advanced data mining and knowledge extraction techniques.
2. Visual interfaces are a powerful means for exploring and analyzing large amounts of data and for providing access to knowledge and facts. Additionally, visual interfaces can serve as a common discourse platform that supports collaboration and networking, and empowers users to share their insights with co-workers and contribute knowledge to the Linked Data Cloud.

Vision

We envision an approach combining automatic methods and visual techniques for discovering facts and deriving new knowledge from massive data. Besides data and content, which contain implicit knowledge, two explicit knowledge sources shall be utilized in the process: semantic databases and human expertise.

Patterns and facts, which are either discovered by algorithms or unveiled through visual analysis, shall be validated by humans and integrated into Linked Data repositories through dedicated visual interfaces (Figure 1). The advantage of integrating the newly acquired facts and knowledge is twofold:

1. It can be used by extraction and mining algorithms to improve their performance.
2. It is made available to other people, either direct collaborators or people working independently on related topics.

It should be noted that the proposed scenario borrows from the ideas of the social Web. However, instead of publishing new user-generated content on various Web platforms, the focus here is on deriving and sharing new facts and knowledge through integration with existing knowledge bases (i.e. Linked Data).
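The validation and integration step described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names Fact, KnowledgeBase and review, as well as the ex: identifiers, are hypothetical and not part of the proposed system, and an in-memory set of triples stands in for a real Linked Data repository. The human validation step is modelled as a callback that accepts or rejects each candidate fact.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Fact:
    """A candidate fact plus its origin (hypothetical structure)."""
    subject: str
    predicate: str
    obj: str
    origin: str  # e.g. "algorithm" or "visual-analysis"

    def as_triple(self) -> Triple:
        return (self.subject, self.predicate, self.obj)


@dataclass
class KnowledgeBase:
    """In-memory stand-in for a Linked Data repository (assumption)."""
    triples: Set[Triple] = field(default_factory=set)


def review(candidates: Iterable[Fact],
           validator: Callable[[Fact], bool],
           kb: KnowledgeBase) -> List[Fact]:
    """Integrate only those candidate facts that the validator accepts."""
    accepted = []
    for fact in candidates:
        if validator(fact):  # stands in for the human decision
            kb.triples.add(fact.as_triple())
            accepted.append(fact)
    return accepted


# Two candidates: one correct, one wrong extraction.
candidates = [
    Fact("ex:Graz", "ex:locatedIn", "ex:Austria", "algorithm"),
    Fact("ex:Graz", "ex:locatedIn", "ex:Australia", "algorithm"),
]
kb = KnowledgeBase()
# The lambda plays the role of an expert rejecting the wrong fact.
accepted = review(candidates, lambda f: f.obj != "ex:Australia", kb)
print(sorted(kb.triples))
```

Keeping the origin of each fact alongside the triple is one possible way to let later steps distinguish facts found by algorithms from facts found through visual analysis.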

[Figure 1 diagram labels: LOD Cloud; Data/Content Repositories (data, content, knowledge); Extraction of Semantics and Data Mining (knowledge, extracted facts, patterns); Visual Interfaces (pattern recognition, hypothesis generation and validation, deriving new knowledge); Validated Facts and Knowledge]

Figure 1: Extraction of new knowledge and facts and their integration into the Linked Data Cloud using a combination of automatic and visual methods.

Semantic Enrichment and Data Mining

Extraction of semantics from unstructured data such as text is a well-established area of research; however, extraction and disambiguation of facts still pose significant challenges. Current information extraction systems [Etzioni et al. 2010] are capable of extracting simple, frequent factual relationships from open data sets like the Web, but extraction and disambiguation of infrequent entities for specific domains, and integration of the extracted semantics into Linked Data repositories, remain a major challenge. Integration of different, ontologically encoded knowledge can be achieved through ontology alignment or ontology merging algorithms [Euzenat & Shvaiko 2007]; however, approaches involving human intervention through visual interfaces may help alleviate algorithm limitations [Granitzer et al. 2010]. The use of semantic information for advancing data mining methods appears to be a promising research topic: several authors have shown that semantic information can be used successfully in mining tasks such as text clustering [Hotho et al. 2002], [Szczuka et al. 2011].

Our goal is to develop methods for extracting semantic information and facts, and for enriching human-readable content such as governmental information, scientific publications, patent databases, media articles and user-generated content. Newly extracted knowledge and facts shall be integrated with the Linked Open Data Cloud. The resulting methods will be based on natural language processing (NLP) and machine learning techniques, such as scalable text clustering [Muhr et al. 2010a] and cluster labelling [Muhr et al. 2010b]. In this setup, several challenges deserve particular attention: entity disambiguation techniques [Kern et al. 2010], [Kern et al. 2011b], information diffusion and reuse [Kern et al. 2011a], and information quality and trust [Lex et al. 2010].

Visual Access and Analysis

Visual Analytics [Thomas & Cook 2005], [Keim et al. 2010] is a research field focusing on supporting humans in analytical reasoning over massive data sets using visual interfaces. It strives to integrate human knowledge and experience effectively into complex analytical processes by suitably combining machine processing with visual analysis methods.

Our goal is to develop visual analysis techniques for large unstructured repositories and for Linked Data repositories. Visual interfaces built atop automatic techniques shall provide intuitive access to data and knowledge, support pattern recognition, and allow the discovered, validated facts to be integrated back into the Linked Data Cloud. Ontological information can be used to improve the presentation of visual interfaces in a variety of ways [Paulheim & Probst 2010]. Ontologically described user interface components will be easily adaptable to various data types and sources, allowing users to explore, analyse and manipulate information delivered by semantic enrichment, mining and integration methods. Visual components for assisting and simplifying data binding from Linked Open Data repositories to user interface elements shall also be considered.

A visualization system providing methods for exploring and analysing massive repositories, and for integrating the discovered knowledge into semantic knowledge repositories, will be composed of a variety of visualization components which can be grouped into two categories [Granitzer et al. 2011]: discovery components, where the information flows from the repository to the users, and description components, which are suitable for expressing knowledge and integrating it with the Linked Data.

Discovery components serve analytical purposes such as explorative navigation and discovery of new insights. Examples are charting components (for example bar, pie, line and spider charts) and advanced analytical components targeting abstract information such as graphs and networks, topical relatedness, and change and temporal information [Sabol et al. 2009]. Selection of suitable visual metaphors for a particular task and data shall be performed (semi-)automatically based on available best practices [Lengler & Eppler 2007].

Description components are customizable knowledge visualization metaphors [Eppler & Burkhard 2004], [Bertschi et al. 2011] that empower users to express and communicate facts and knowledge intuitively. A new visualization shall initially be created as an empty "skeleton" of a chosen metaphor, to which the user applies a "Builder Tool" to construct a visual representation expressing (newly acquired) knowledge. In doing so, the expressed facts shall be integrated into the Linked Data Cloud. If made publicly available, such a visual representation would also provide a platform for transferring and communicating knowledge to a broader audience, which can extend it by integrating additional knowledge.
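The (semi-)automatic selection of visual metaphors mentioned for discovery components could, in its simplest form, be a rule table that maps a data type and an analytical task to candidate chart types, which the user then confirms or overrides. The following Python sketch illustrates this idea only. The rule table, the category names and the function suggest_metaphors are assumptions made for illustration; they are not taken from [Lengler & Eppler 2007] or from the proposed system.

```python
from typing import Dict, List, Tuple

# Hypothetical rule table: (data kind, task) -> ranked candidate metaphors.
# The chart types are those named in the text; the mapping itself is an
# illustrative assumption, not an established best-practice catalogue.
RULES: Dict[Tuple[str, str], List[str]] = {
    ("categorical", "compare"): ["bar chart", "spider chart"],
    ("categorical", "part-to-whole"): ["pie chart", "bar chart"],
    ("temporal", "trend"): ["line chart", "bar chart"],
    ("relational", "explore"): ["graph view"],
}

FALLBACK = ["table"]  # used when no rule matches


def suggest_metaphors(data_kind: str, task: str) -> List[str]:
    """Return ranked suggestions; the user makes the final choice."""
    return list(RULES.get((data_kind, task), FALLBACK))


suggestions = suggest_metaphors("temporal", "trend")
print(suggestions)  # ['line chart', 'bar chart']
```

Returning a ranked list rather than a single answer keeps the human in the loop, which corresponds to the "semi-automatic" variant; a fully automatic variant would simply take the first entry.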

Challenges

We compile an (incomplete) list of challenges, grouped into five categories, which need to be addressed to realize the vision described above:

1. Data and information:
   a) Information quality: With the advent of the Social Web and user-generated content, information quality and trustworthiness gain a prominent role. The same is true for user-generated knowledge.
   b) Information reuse and diffusion: In large-scale collaboration scenarios, tracking the diffusion of information and knowledge and identifying their reuse become important.
   c) Security and ownership: Access control and security mechanisms are hard to address in large, distributed data and knowledge bases.
   d) Change and evolution: Identification of trends and temporal patterns, and handling of high rates of change in data, content and knowledge, are necessary when dealing with dynamically changing repositories.
   e) Data integration: Binding to different data and knowledge repositories, and transforming data into the required form, become indispensable when dealing with heterogeneous information.
2. Storage infrastructures: Efficiently applying distributed Big Data storage infrastructures, such as Hadoop/HDFS, and integrating them with traditional infrastructure (relational databases) and with emerging models, such as cloud computing.
3. Algorithms: To cope with huge amounts of data, scalable algorithms need to be developed. Two approaches appear promising in this context:
   a) Distributed algorithms, for example based on Hadoop (MapReduce + HDFS), process data on a large number of computing nodes, which may be placed in geographically separate locations.
   b) Streaming algorithms process a huge stream of information in one or a few passes and with limited resources (memory) by building an approximate summary or aggregation of the data.
4. User interfaces and visualization:
   a) Visual analytics: Advanced, scalable visual interfaces are necessary for analysis and exploration of large data and knowledge bases. The use of semantic descriptors holds the promise of automatic binding to semantically described data.
   b) Mobility: The mobile boom redefines the requirements on GUI design and interactivity.
   c) Context: Besides the classical user context (e.g. task, preferences), user context can be extended with sensor data, such as those provided by mobile devices.
   d) Collaboration: New collaboration possibilities arise through increased mobility and permanent broadband network access.
5. Use and commercialization: For various types of content (such as videos, music and news), established commercial and non-commercial utilization models exist. For Linked Data this is not (yet) the case. Sustainable ecosystems for LOD utilization need to be built and established, possibly drawing on the lessons of the social Web and user-generated content.
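To make the notion of a streaming algorithm from item 3 b) concrete, the sketch below shows one well-known example of a single-pass, bounded-memory summary: the Misra-Gries frequent-items algorithm. It is chosen here purely as an illustration and is not a method proposed in this paper. With a parameter k it keeps at most k - 1 counters, and each reported count underestimates the true count by at most n/k for a stream of n items.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """One pass over the stream, keeping at most k - 1 counters."""
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those reaching zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# "a" occurs 6 times in a stream of 9 items; with k = 2 only one counter
# is kept, and its value (3) lies within n/k = 4.5 of the true count.
stream = ["a"] * 6 + ["b", "c", "d"]
print(misra_gries(stream, 2))  # {'a': 3}
```

The summary is approximate by design: memory use depends only on k, not on the length of the stream, which is the trade-off the streaming approach relies on.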

Each of the challenges presents a research topic of its own. Our interests have focused on topics such as information reuse, quality and evolution, consideration of user context, and visual analytical interfaces (including mobile applications). To advance the presented vision and address the challenges in a satisfactory manner, bundling resources and competencies appears to be the natural way forward.

References

[Bertschi et al. 2011] Bertschi, S., Bresciani, S., Crawford, T., Goebel, R., Kienreich, W., Lindner, M., Sabol, V., Vande Moere, A. (2011): What is Knowledge Visualization? Opinions on Current and Future State, in Proceedings of the 15th International Conference Information Visualisation (IV'11).
[Eppler & Burkhard 2004] Eppler, M.J., Burkhard, R.A. (2004): Knowledge Visualization: Towards a New Discipline and its Fields of Application, ICA Working Paper #2/2004, University of Lugano, Switzerland.
[Etzioni et al. 2010] Etzioni, O., Banko, M., Soderland, S., Weld, D.S. (2008): Open information extraction from the web, Communications of the ACM, 51(12), 68.
[Euzenat & Shvaiko 2007] Euzenat, J., Shvaiko, P. (2007): Ontology Matching, Springer-Verlag.
[Granitzer et al. 2010] Granitzer, M., Sabol, V., Onn, K.W., Lukose, D., Tochtermann, K. (2010): Ontology Alignment: A Survey with Focus on Visually Supported Semi-Automatic Techniques, Future Internet, Volume 2, Issue 3, MDPI AG.
[Granitzer et al. 2011] Granitzer, M., Sabol, V., Kienreich, W., Lukose, D., Onn, K.W. (2011): Visual Analyses on Linked Data: An Opportunity for both Fields, The 2011 STI Semantic Summit, Riga, Latvia.
[Hotho et al. 2002] Hotho, A., Maedche, A., Staab, S. (2002): Ontology-based text document clustering, Künstliche Intelligenz, 16(4).
[Keim et al. 2010] Keim, D., Mansmann, F., Thomas, J. (2010): Visual analytics: how much visualization and how much analytics?, ACM SIGKDD Explorations Newsletter, 11(2), 5-8, ACM.
[Kern et al. 2010] Kern, R., Muhr, M., Granitzer, M. (2010): KCDC: Word Sense Induction by Using Grammatical Dependencies and Sentence Phrase Structure, in Proceedings of SemEval-2.
[Kern et al. 2011a] Kern, R., Seifert, C., Zechner, M., Granitzer, M. (2011): Vote/Veto Meta-Classifier for Authorship Identification, 3rd International Competition on Plagiarism Detection.
[Kern et al. 2011b] Kern, R., Zechner, M., Granitzer, M. (2011): Model Selection Strategies for Author Disambiguation, 8th International Workshop on Text-based Information Retrieval, in Proceedings of the 22nd International Conference on Database and Expert Systems Applications (DEXA 11), IEEE Computer Society.
[Lengler & Eppler 2007] Lengler, R., Eppler, M. (2007): Towards A Periodic Table of Visualization Methods for Management, IASTED Proceedings of the Conference on Graphics and Visualization in Engineering (GVE 2007), Clearwater, Florida, USA.
[Lex et al. 2010] Lex, E., Khan, I., Bischof, H., Granitzer, M. (2010): Assessing the Quality of Web Content, Proceedings of the ECML/PKDD Discovery Challenge.
[Muhr et al. 2010a] Muhr, M., Sabol, V., Granitzer, M. (2010): Scalable Recursive Top-Down Hierarchical Clustering Approach with implicit Model Selection for Textual Data Sets, 7th International Workshop on Text-based Information Retrieval, in Proceedings of the 21st International Conference on Database and Expert Systems Applications (DEXA 10), IEEE.
[Muhr et al. 2010b] Muhr, M., Kern, R., Granitzer, M. (2010): Analysis of Structural Relationships for Hierarchical Cluster Labeling, in Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM.
[Paulheim & Probst 2010] Paulheim, H., Probst, F. (2010): Ontology-Enhanced User Interfaces: A Survey, International Journal on Semantic Web and Information Systems (IJSWIS), 6(2).
[Sabol et al. 2009] Sabol, V., Kienreich, W., Muhr, M., Klieber, W., Granitzer, M. (2009): Visual Knowledge Discovery in Dynamic Enterprise Text Repositories, Proceedings of the 13th International Conference on Information Visualisation (IV09), IEEE.
[Szczuka et al. 2011] Szczuka, M., Janusz, A., Herba, K. (2011): Clustering of rough set related documents with use of knowledge from DBpedia, Proceedings of the 6th International Conference on Rough Sets and Knowledge Technology (RSKT'11).
[Thomas & Cook 2005] Thomas, J.J., Cook, K.A. (2005): Illuminating the Path: The Research and Development Agenda for Visual Analytics (p. 186), IEEE Computer Society.


More information

Big Data and Natural Language: Extracting Insight From Text

Big Data and Natural Language: Extracting Insight From Text An Oracle White Paper October 2012 Big Data and Natural Language: Extracting Insight From Text Table of Contents Executive Overview... 3 Introduction... 3 Oracle Big Data Appliance... 4 Synthesys... 5

More information

Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot

Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot www.etidaho.com (208) 327-0768 Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot 3 Days About this Course This course is designed for the end users and analysts that

More information

Using Semantic Data Mining for Classification Improvement and Knowledge Extraction

Using Semantic Data Mining for Classification Improvement and Knowledge Extraction Using Semantic Data Mining for Classification Improvement and Knowledge Extraction Fernando Benites and Elena Sapozhnikova University of Konstanz, 78464 Konstanz, Germany. Abstract. The objective of this

More information

MULTI AGENT-BASED DISTRIBUTED DATA MINING

MULTI AGENT-BASED DISTRIBUTED DATA MINING MULTI AGENT-BASED DISTRIBUTED DATA MINING REECHA B. PRAJAPATI 1, SUMITRA MENARIA 2 Department of Computer Science and Engineering, Parul Institute of Technology, Gujarat Technology University Abstract:

More information

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518 International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 Over viewing issues of data mining with highlights of data warehousing Rushabh H. Baldaniya, Prof H.J.Baldaniya,

More information

MLg. Big Data and Its Implication to Research Methodologies and Funding. Cornelia Caragea TARDIS 2014. November 7, 2014. Machine Learning Group

MLg. Big Data and Its Implication to Research Methodologies and Funding. Cornelia Caragea TARDIS 2014. November 7, 2014. Machine Learning Group Big Data and Its Implication to Research Methodologies and Funding Cornelia Caragea TARDIS 2014 November 7, 2014 UNT Computer Science and Engineering Data Everywhere Lots of data is being collected and

More information

Big Data Integration: A Buyer's Guide

Big Data Integration: A Buyer's Guide SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology

More information

Politecnico di Torino. Porto Institutional Repository

Politecnico di Torino. Porto Institutional Repository Politecnico di Torino Porto Institutional Repository [Proceeding] NEMICO: Mining network data through cloud-based data mining techniques Original Citation: Baralis E.; Cagliero L.; Cerquitelli T.; Chiusano

More information

From Data to Foresight:

From Data to Foresight: Laura Haas, IBM Fellow IBM Research - Almaden From Data to Foresight: Leveraging Data and Analytics for Materials Research 1 2011 IBM Corporation The road from data to foresight is long? Consumer Reports

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

More information

Why big data? Lessons from a Decade+ Experiment in Big Data

Why big data? Lessons from a Decade+ Experiment in Big Data Why big data? Lessons from a Decade+ Experiment in Big Data David Belanger PhD Senior Research Fellow Stevens Institute of Technology dbelange@stevens.edu 1 What Does Big Look Like? 7 Image Source Page:

More information

A GENERAL TAXONOMY FOR VISUALIZATION OF PREDICTIVE SOCIAL MEDIA ANALYTICS

A GENERAL TAXONOMY FOR VISUALIZATION OF PREDICTIVE SOCIAL MEDIA ANALYTICS A GENERAL TAXONOMY FOR VISUALIZATION OF PREDICTIVE SOCIAL MEDIA ANALYTICS Stacey Franklin Jones, D.Sc. ProTech Global Solutions Annapolis, MD Abstract The use of Social Media as a resource to characterize

More information

Connecting library content using data mining and text analytics on structured and unstructured data

Connecting library content using data mining and text analytics on structured and unstructured data Submitted on: May 5, 2013 Connecting library content using data mining and text analytics on structured and unstructured data Chee Kiam Lim Technology and Innovation, National Library Board, Singapore.

More information

Media Watch on Climate Change. Geospatial Web Technology for Accessing Environmental Online Resources http://www.ecoresearch.

Media Watch on Climate Change. Geospatial Web Technology for Accessing Environmental Online Resources http://www.ecoresearch. Media Watch on Climate Change Geospatial Web Technology for Accessing Environmental Online Resources http://www.ecoresearch.net/climate IDIOM Project Scientific Partner: MODUL University Vienna Technical

More information

Keywords: Big Data, HDFS, Map Reduce, Hadoop

Keywords: Big Data, HDFS, Map Reduce, Hadoop Volume 5, Issue 7, July 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Configuration Tuning

More information

Visual Analysis of Statistical Data on Maps using Linked Open Data

Visual Analysis of Statistical Data on Maps using Linked Open Data Visual Analysis of Statistical Data on Maps using Linked Open Data Petar Ristoski and Heiko Paulheim University of Mannheim, Germany Research Group Data and Web Science {petar.ristoski,heiko}@informatik.uni-mannheim.de

More information

Big Data Mining: Challenges and Opportunities to Forecast Future Scenario

Big Data Mining: Challenges and Opportunities to Forecast Future Scenario Big Data Mining: Challenges and Opportunities to Forecast Future Scenario Poonam G. Sawant, Dr. B.L.Desai Assist. Professor, Dept. of MCA, SIMCA, Savitribai Phule Pune University, Pune, Maharashtra, India

More information

Find the signal in the noise

Find the signal in the noise Find the signal in the noise Electronic Health Records: The challenge The adoption of Electronic Health Records (EHRs) in the USA is rapidly increasing, due to the Health Information Technology and Clinical

More information

TYPE OF PRESENTATION PROPOSED: Research contribution

TYPE OF PRESENTATION PROPOSED: Research contribution PAPER SUBMISSION FOR EDF 2014 TITLE OF PRESENTATION: VALCRI: Addressing European Needs for Information Exploitation of Large Complex Data in Criminal Intelligence Analysis SUMMARY OF THE PRESENTATION This

More information

Conquering the Astronomical Data Flood through Machine

Conquering the Astronomical Data Flood through Machine Conquering the Astronomical Data Flood through Machine Learning and Citizen Science Kirk Borne George Mason University School of Physics, Astronomy, & Computational Sciences http://spacs.gmu.edu/ The Problem:

More information

How To Find Influence Between Two Concepts In A Network

How To Find Influence Between Two Concepts In A Network 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation Influence Discovery in Semantic Networks: An Initial Approach Marcello Trovati and Ovidiu Bagdasar School of Computing

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

Cloud computing based big data ecosystem and requirements

Cloud computing based big data ecosystem and requirements Cloud computing based big data ecosystem and requirements Yongshun Cai ( 蔡 永 顺 ) Associate Rapporteur of ITU T SG13 Q17 China Telecom Dong Wang ( 王 东 ) Rapporteur of ITU T SG13 Q18 ZTE Corporation Agenda

More information

User Needs and Requirements Analysis for Big Data Healthcare Applications

User Needs and Requirements Analysis for Big Data Healthcare Applications User Needs and Requirements Analysis for Big Data Healthcare Applications Sonja Zillner, Siemens AG In collaboration with: Nelia Lasierra, Werner Faix, and Sabrina Neururer MIE 2014 in Istanbul: 01-09-2014

More information

Societal Data Resources and Data Processing Infrastructure

Societal Data Resources and Data Processing Infrastructure Societal Data Resources and Data Processing Infrastructure Bruno Martins INESC-ID & Instituto Superior Técnico bruno.g.martins@ist.utl.pt 1 DATASTORM Task on Societal Data Project vision : Build infrastructure

More information

Unleashing Semantics of Research Data

Unleashing Semantics of Research Data Unleashing Semantics of Research Data Florian Stegmaier 1, Christin Seifert 1, Roman Kern 2, Patrick Höfler 2, Sebastian Bayerl 1, Michael Granitzer 1, Harald Kosch 1, Stefanie Lindstaedt 2, Belgin Mutlu

More information

Some Research Challenges for Big Data Analytics of Intelligent Security

Some Research Challenges for Big Data Analytics of Intelligent Security Some Research Challenges for Big Data Analytics of Intelligent Security Yuh-Jong Hu hu at cs.nccu.edu.tw Emerging Network Technology (ENT) Lab. Department of Computer Science National Chengchi University,

More information

Scalable End-User Access to Big Data http://www.optique-project.eu/ HELLENIC REPUBLIC National and Kapodistrian University of Athens

Scalable End-User Access to Big Data http://www.optique-project.eu/ HELLENIC REPUBLIC National and Kapodistrian University of Athens Scalable End-User Access to Big Data http://www.optique-project.eu/ HELLENIC REPUBLIC National and Kapodistrian University of Athens 1 Optique: Improving the competitiveness of European industry For many

More information

FOUNDATIONS OF A CROSS- DISCIPLINARY PEDAGOGY FOR BIG DATA

FOUNDATIONS OF A CROSS- DISCIPLINARY PEDAGOGY FOR BIG DATA FOUNDATIONS OF A CROSSDISCIPLINARY PEDAGOGY FOR BIG DATA Joshua Eckroth Stetson University DeLand, Florida 3867402519 jeckroth@stetson.edu ABSTRACT The increasing awareness of big data is transforming

More information

Mastering Big Data. Steve Hoskin, VP and Chief Architect INFORMATICA MDM. October 2015

Mastering Big Data. Steve Hoskin, VP and Chief Architect INFORMATICA MDM. October 2015 Mastering Big Data Steve Hoskin, VP and Chief Architect INFORMATICA MDM October 2015 Agenda About Big Data MDM and Big Data The Importance of Relationships Big Data Use Cases About Big Data Big Data is

More information

www.pwc.com Implementation of Big Data and Analytics Projects with Big Data Discovery and BICS March 2015

www.pwc.com Implementation of Big Data and Analytics Projects with Big Data Discovery and BICS March 2015 www.pwc.com Implementation of Big Data and Analytics Projects with Big Data Discovery and BICS Agenda Big Data Discovery Oracle Business Intelligence Cloud Services (BICS) Use Cases How to start and our

More information

Big Data on Microsoft Platform

Big Data on Microsoft Platform Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4

More information

Automatic Annotation Wrapper Generation and Mining Web Database Search Result

Automatic Annotation Wrapper Generation and Mining Web Database Search Result Automatic Annotation Wrapper Generation and Mining Web Database Search Result V.Yogam 1, K.Umamaheswari 2 1 PG student, ME Software Engineering, Anna University (BIT campus), Trichy, Tamil nadu, India

More information

Elsa C. Augustenborg Gary R. Danielson Andrew E. Beck

Elsa C. Augustenborg Gary R. Danielson Andrew E. Beck Elsa C. Augustenborg Gary R. Danielson Andrew E. Beck Pacific Northwest National Laboratory PNNL-SA-75867 Overview Technical challenges Institutional challenges Architectural approach Examples: Promising

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 1 CHAPTER 1 INTRODUCTION Exploration is a process of discovery. In the database exploration process, an analyst executes a sequence of transformations over a collection of data structures to discover useful

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD 72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD Paulo Gottgtroy Auckland University of Technology Paulo.gottgtroy@aut.ac.nz Abstract This paper is

More information

Meet AkzoNobel Leading market positions delivering leading performance

Meet AkzoNobel Leading market positions delivering leading performance Meet AkzoNobel Leading market positions delivering leading performance BI Thema dag VNSG A Business Intelligence journey John Wenmakers AkzoNobel Leon Huijsmans Interdobs December 10, 2013 Agenda A Business

More information

Manjula Ambur NASA Langley Research Center April 2014

Manjula Ambur NASA Langley Research Center April 2014 Manjula Ambur NASA Langley Research Center April 2014 Outline What is Big Data Vision and Roadmap Key Capabilities Impetus for Watson Technologies Content Analytics Use Potential use cases What is Big

More information

The Big Data Paradigm Shift. Insight Through Automation

The Big Data Paradigm Shift. Insight Through Automation The Big Data Paradigm Shift Insight Through Automation Agenda The Problem Emcien s Solution: Algorithms solve data related business problems How Does the Technology Work? Case Studies 2013 Emcien, Inc.

More information

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM. DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

More information

Visualizing Repertory Grid Data for Formative Assessment

Visualizing Repertory Grid Data for Formative Assessment Visualizing Repertory Grid Data for Formative Assessment Kostas Pantazos 1, Ravi Vatrapu 1, 2 and Abid Hussain 1 1 Computational Social Science Laboratory (CSSL) Department of IT Management, Copenhagen

More information

Linked Open Data A Way to Extract Knowledge from Global Datastores

Linked Open Data A Way to Extract Knowledge from Global Datastores Linked Open Data A Way to Extract Knowledge from Global Datastores Bebo White SLAC National Accelerator Laboratory HKU Expert Address 18 September 2014 Developments in science and information processing

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

Visualization. Program visualization

Visualization. Program visualization Visualization Program visualization Debugging programs without the aid of support tools can be extremely difficult. See My Hairest Bug War Stories, Marc Eisenstadt, Communications of the ACM, Vol 40, No

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

Knowledge Discovery from patents using KMX Text Analytics

Knowledge Discovery from patents using KMX Text Analytics Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers

More information

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out Big Data Challenges and Success Factors Deloitte Analytics Your data, inside out Big Data refers to the set of problems and subsequent technologies developed to solve them that are hard or expensive to

More information

International Journal of Innovative Research in Computer and Communication Engineering

International Journal of Innovative Research in Computer and Communication Engineering FP Tree Algorithm and Approaches in Big Data T.Rathika 1, J.Senthil Murugan 2 Assistant Professor, Department of CSE, SRM University, Ramapuram Campus, Chennai, Tamil Nadu,India 1 Assistant Professor,

More information