KNIME Open Source Days 2012, Sep 3-7, Konstanz, Germany

1 KNIME Open Source Days 2012, Sep 3-7, Konstanz, Germany

2 Mind Era - who are we? Mind Eratosthenes Kft., Budapest, Hungary, mind-era.com. Katalin Bakos: CEO (sister). Gábor Bakos: mathematician, software engineer (brother). KOS Days 2012

3 RapidMiner, HiTS - what is it? RapidMiner: another open source framework for data mining. We integrated it into KNIME; it works like a metanode. HiTS: some nodes to help data analysis of High Throughput/Content Screenings. Contains nodes to perform cellHTS2 transformations, visualize data, and transform data, plus a failed experiment in handling and searching images using Bio-Formats.

4 RapidMiner, HiTS - highlights. RapidMiner Node: allows executing and editing RapidMiner workflows (processes). RapidMiner Viewer Node: helps visualize data. HiTS nodes: Leaf ordering, Reverse Order, Sort by Cluster, Dendrogram with Heatmap, Simple Heatmap, Rank, Direct Product, Merge (kind of antisort), Pivot, Unpivot, Subsets

5 STARK. Joint initiative: KNIME + PASCAL2. Prof. José L. Balcázar (UC, now UPC): proposer and part-time programmer. Personnel from Universidad de Cantabria: Javier de la Dehesa (senior undergrad, now grad student, coded most of it), Diego García-Sáiz (grad student), Cristina Tîrnauca (post-doc).

6 STARK - what is it? Self-Tuning Association Rules for KNIME: a KNIME node that performs association rule mining with very low configuration needs. Tuning the support threshold and choosing rule interest measures are very difficult tasks for end users. We proposed a self-tuning approach: decreasing support traversal, confidence boost. Prototype in Python: yacaree.sf.net. Now: porting it into KNIME. Will try to sell it to you all these days...
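
For readers new to the terms on this slide, here is a minimal sketch of the two textbook measures the tuning problem revolves around: the support of an itemset and the confidence of a rule. It is not the yacaree or STARK code, and it does not implement the self-tuning traversal or the confidence boost. The function names and the toy transactions are assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Confidence of the rule antecedent -> consequent: supp(A and C) / supp(A)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    supp_a = support(a, transactions)
    if supp_a == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(both, transactions) / supp_a


# Toy data (hypothetical): four shopping baskets.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

bread_support = support({"bread"}, transactions)
bread_to_milk = confidence({"bread"}, {"milk"}, transactions)
```

A user of a classic miner has to pick a minimum support and a minimum confidence by hand; the slide's point is that the node is meant to spare the user that choice.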

7 Current status. The Yacaree node exists now. The confidence boost handling needs a bit of improvement. The usage is a bit complicated. BUT: the Python version went ahead; the KNIME node is a bit behind. Algorithms have advanced even further conceptually. Trying to catch up this week!

8 GenericWorkflowNodes for SeqAn and OpenMS. Freie Universität Berlin. Prof. Knut Reinert: Head of the Algorithmic Bioinformatics group. Stephan Aiche: Research Associate. Björn Kahlert: Research Associate.

9 GenericWorkflowNodes for SeqAn/OpenMS - what is it? GenericWorkflowNodes: wrap existing tools into KNIME nodes. SeqAn/OpenMS: open source frameworks for sequence analysis and analysis of mass spectrometry data, developed at Freie Universität Berlin (SeqAn, OpenMS) and Universität Tübingen (OpenMS).

10 GenericWorkflowNodes - highlights. SeqAn/OpenMS nodes: most OpenMS and SeqAn apps are available in KNIME. CTD (Common Tool Description): a generic XML-based description of command line tools. Translate any tool you need into a KNIME node based on a CTD for the tool.
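
The idea of driving a tool from an XML description can be sketched as follows. The XML layout, the tool name `example_filter` and the function `build_command` are all hypothetical and deliberately simplified; this is not the actual CTD schema and not the GenericWorkflowNodes code. It only shows how a declarative description can be turned into a command line, which is what makes wrapping a tool generic.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified tool description (NOT the real CTD schema).
TOOL_XML = """
<tool name="ExampleFilter">
  <executable>example_filter</executable>
  <param name="in" type="file" value="input.txt"/>
  <param name="threshold" type="float" value="0.5"/>
</tool>
"""


def build_command(xml_text, overrides=None):
    """Turn the description into an argument list, applying user overrides."""
    overrides = overrides or {}
    root = ET.fromstring(xml_text)
    command = [root.findtext("executable")]
    for param in root.findall("param"):
        name = param.get("name")
        # A value set by the user replaces the default from the description.
        value = overrides.get(name, param.get("value"))
        command += ["-" + name, str(value)]
    return command


default_command = build_command(TOOL_XML)
tuned_command = build_command(TOOL_XML, {"threshold": 0.9})
```

In a workflow node, the parameters listed in the description would become the entries of the configuration dialog, and the resulting argument list would be handed to the wrapped executable.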

11 Cortana - who are we? Leiden University. Arno Knobbe: post-doc, occasional programmer. Marvin Meeng: main developer. Wouter Duivesteijn, Michael Mampaey, Rob Konijn.

12 Cortana - what is it? Modern Subgroup Discovery tool, developed at Leiden University. Research vehicle to address problems in Subgroup Discovery. Analysis tool used in many domains: bank transaction data, bioinformatics (genomics/metabolomics), chemical drug compound efficacy.

13 Cortana - highlights. Generic SD algorithm: target type/quality measure, search conditions/search strategy. Visualisation and manipulation of both data and results: table, histogram, scatter plot, DAG; change data type, missing values; subgroup inspection, ROC plots.
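
To make "quality measure" concrete, here is a minimal sketch of Weighted Relative Accuracy (WRAcc), one widely used quality measure for subgroups with a binary target. The slide does not say which measures Cortana offers or how they are coded, so the function name and the toy data are assumptions for illustration only.

```python
from typing import Sequence


def wracc(in_subgroup: Sequence[bool], is_target: Sequence[bool]) -> float:
    """WRAcc = p(S) * (p(T | S) - p(T)) for subgroup S and binary target T."""
    n = len(in_subgroup)
    n_sub = sum(in_subgroup)
    if n == 0 or n_sub == 0:
        return 0.0  # an empty subgroup carries no information
    n_pos = sum(is_target)
    n_sub_pos = sum(1 for s, t in zip(in_subgroup, is_target) if s and t)
    return (n_sub / n) * (n_sub_pos / n_sub - n_pos / n)


# Toy data (hypothetical): four records, the first two belong to the subgroup.
membership = [True, True, False, False]
target = [True, True, True, False]
quality = wracc(membership, target)
```

A search strategy then enumerates candidate subgroup descriptions and keeps those with the highest quality, which is why the measure and the search settings appear together on the slide.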

15 Palladian KNIME Open Source Days, Konstanz Klemens Muthmann, TU Dresden

16 About Us. Information retrieval team, Lehrstuhl Rechnernetze, TU Dresden: Klemens Muthmann, Philipp Katz, David Urbansky (not pictured).

17 About Palladian. Java-based toolkit for information retrieval. Provides users with a basic set of tools. Palladian's strengths: text classification, feed reading, named entity recognition, date recognition, keyword extraction, content scraping.

18 Highlights Palladian text classifier for sentiment analysis

19 Highlights

20 PMM Lab - who are we? Federal Institute for Risk Assessment, Germany. Christian Thöns: Programmer and Research Assistant. And others (Matthias Filter, Jörgen Brandt, Armin Weiser, Alexander Falenski).

21 PMM Lab - what is it? Collection of KNIME nodes for Predictive Microbiology, developed at the Federal Institute for Risk Assessment since 2011. Provides nodes for fitting and visualizing Predictive Microbiology models.
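
As an illustration of what "fitting" a model means here, the sketch below fits a simple log-linear model, log10 N(t) = log10 N0 + k * t, to measured cell counts by ordinary least squares. The slide does not state which models PMM Lab provides or how it fits them, so the model choice, the function name and the data are assumptions.

```python
from typing import Sequence, Tuple


def fit_log_linear(times: Sequence[float],
                   log_counts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log10 N(t) = log10_n0 + k * t.

    Returns (log10_n0, k).
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(log_counts) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, log_counts))
    k = sxy / sxx
    log10_n0 = mean_y - k * mean_t
    return log10_n0, k


# Hypothetical measurements: time in hours, counts as log10 CFU/g.
hours = [0.0, 1.0, 2.0, 3.0]
log10_cfu = [3.0, 3.5, 4.0, 4.5]
log10_n0, rate = fit_log_linear(hours, log10_cfu)
```

The fitted parameters (initial count and rate) are what a view would then plot against the measured points.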

22 KNIME - highlights. Views for PMM models and data. Users can enter new models (model equations are parsed with JEP).

23 Who are we? University of Tübingen / Applied Bioinformatics group. Expertise in proteomics/metabolomics, drug design, molecular modelling, sequence analysis, systems biology and immunoinformatics. Prof. Oliver Kohlbacher: Head of the Applied Bioinformatics Group. Luis de la Garza: PhD student. Kohlbacher, de la Garza, Applied Bioinformatics Group

24 What do we do? Workflows on grid systems. Integration of the Computer Aided Drug Design Suite (CADDSuite) and OpenMS as KNIME nodes. GenericKnimeNodes development together with FU Berlin.

25 Highlights - CADDSuite. Flexible and open workflow-enabled framework for computer-aided drug design. Part of the Biochemical Algorithms Library (BALL) project. Offers solutions to common tasks in drug design such as file format conversion, molecule preparation, docking, etc.

26 Highlights - OpenMS. Open mass spectrometry / liquid chromatography C++ library. Offers visualization of data, proteomics pipelining, workflow modeling engine, signal processing, feature finding, etc.

27 Who are we? Robert Bosch GmbH, DS/ETM. Alexander Warta: Test Engineer, Student Tutor; Robert Bosch GmbH, Diesel Systems, Engineering Test Methods (DS/ETM1). Computer Science students (Master): Markus John (05/ /2012), two other students. Diesel Systems DS/ETM1-Wr, -Jo. Robert Bosch GmbH. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.

28 Why KNIME: Context and Challenge. In order to design diesel fuel injection systems for global markets, Robert Bosch GmbH considers many market-specific diesel fuel quality parameters. For this, fuel samples from almost all countries are regularly chemically analyzed by a service provider (so-called fuel surveys). One survey sample record contains up to 140 attributes, e.g. date, town, country, supplier and the results of chemical and physical analysis like sulfur content, density, viscosity, biodiesel content etc. About records are currently of relevance. The previous process integrated Microsoft Excel (plots, histograms, etc.) and PowerPoint (world map) in a non-automated succession. This procedure is quite time consuming, not interactive, inflexible and not scalable.

29 Why KNIME: Catalog of Requirements. Extract knowledge through interactive exploration; easy access to all fuel surveys with filter methods; generate choropleth maps and cartograms; show country names; show additional diagrams for each country; show only selected countries; enrich map with external data (like cities of the fuel survey records, locations of oil refineries, etc.); generate star plots, parallel coordinates, scatterplots; apply data mining algorithms for finding new patterns between instances and features (like association rule learning, hierarchical clustering, multidimensional scaling); enrich fuel survey data with external data (like new diesel car registrations, failure count of the common rail system, etc.).

30 Highlights: KNIME node GenericWorldMap. Generating world maps based on statistical attributes, with additional dimensions shown as bars and scalable icons.

31 Highlights: KNIME node FuelSurveyVisualizer. Generating boxplots, starplots, etc. interactively by integrating R.

32 Highlights: KNIME node FuelSurveyStandardAnalysis. Creating standard presentation slides automatically by integrating Apache POI and R.

33 Highlights: KNIME node FuelSurveyWarnSystem (ongoing). Early warning system to quickly identify worsening fuel quality by integrating JBoss Drools (rule-based system) and Apache POI (generating Excel and Word file output).
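
A rule-based early warning check can be sketched in a few lines. This is plain Python, not Drools, and it is not the Bosch rule set: the two rules, the thresholds, the attribute choice (sulfur content) and all names below are hypothetical and serve only to show the kind of logic such a system evaluates over survey records.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Sample:
    country: str
    year: int
    sulfur_ppm: float


def find_warnings(samples: List[Sample], limit_ppm: float,
                  max_rel_increase: float) -> List[Tuple[str, str]]:
    """Apply two illustrative rules to the latest sample of each country."""
    by_country: Dict[str, List[Sample]] = {}
    for s in samples:
        by_country.setdefault(s.country, []).append(s)

    alerts = []
    for country in sorted(by_country):
        history = sorted(by_country[country], key=lambda s: s.year)
        latest = history[-1]
        # Rule 1: the latest value exceeds an absolute limit.
        if latest.sulfur_ppm > limit_ppm:
            alerts.append((country, "above limit"))
        # Rule 2: the value rose sharply compared with the previous survey.
        if len(history) >= 2 and history[-2].sulfur_ppm > 0:
            previous = history[-2].sulfur_ppm
            if (latest.sulfur_ppm - previous) / previous > max_rel_increase:
                alerts.append((country, "rising"))
    return alerts


# Hypothetical survey records for three placeholder countries.
samples = [
    Sample("A", 2011, 10.0), Sample("A", 2012, 15.0),
    Sample("B", 2011, 400.0), Sample("B", 2012, 380.0),
    Sample("C", 2012, 8.0),
]
alerts = find_warnings(samples, limit_ppm=50.0, max_rel_increase=0.2)
```

A rule engine keeps such conditions outside the program code, so domain experts can change thresholds without touching the node itself.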

34 Developed KNIME Nodes Selection Preprocessing Transformation FuelSurveyReader FuelSurveyDeleter StandardAnalysisXML Modeling Neighbors LocalOutlierDetection DistanceBasedkMeans Percentizer RefineryReader Visualization GenericWorldMap FuelSurveyVisualizer StandardAnalysis LocationTransformer DynamicColumnFilter MultipleReference RowFilter LoopColumnToVariable Elbow FuelSurveyWarnSystem FuelSurveyWarnSystemXML

35 Who are we? Christian Dietz: Image Processing. Martin Horn: Image Processing. Tobias Kötter: Network Mining. Michael Zinsmaier: Image Processing.

36 Our Projects (1/2). Network Mining: framework to process attributed graphs; supports (un)directed, (un)weighted (hyper/multi/k-partite) graphs. Indexing & Searching: high-performance indexing and advanced querying; based on Apache Lucene.

37 Our Projects (2/2). Image Processing and Analysis: extension to process and analyse multidimensional images. Integrates state-of-the-art libraries: ImgLib2, Bio-Formats, ImageJ, ImageJ2, OMERO.

38 KNIME. Iris Adä: Modular Data Generation, Ensemble Methods, JFreeChart. Zaenal Akbar: Parallel Data Mining. Violeta Ivanova: Parallel Data Mining. Sebastian Peter: Web Analytics, JFreeChart.

39 KNIME. Dawid Piatek: Statistics Guru. Thorsten Meinl: Optimization & Build System. Thomas Gabriel: Database Connectors & R. Peter Ohl: File Reader & Server Development.

40 KNIME. Bernd Wiswedel: Data Handling. Aaron Hart: (Magic) Support. Michael Berthold: The Godfather.

41 KNIME. Heather Fyson: Keeps everything running. Peter Burger: System Administrator & BBQ master.


More information

Final Project Report

Final Project Report CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

OpenMS A Framework for Quantitative HPLC/MS-Based Proteomics

OpenMS A Framework for Quantitative HPLC/MS-Based Proteomics OpenMS A Framework for Quantitative HPLC/MS-Based Proteomics Knut Reinert 1, Oliver Kohlbacher 2,Clemens Gröpl 1, Eva Lange 1, Ole Schulz-Trieglaff 1,Marc Sturm 2 and Nico Pfeifer 2 1 Algorithmische Bioinformatik,

More information

Michael Bitter, Robert Bosch GmbH

Michael Bitter, Robert Bosch GmbH Perspective on CO 2 - Penetration in CV-Market Michael Bitter, Robert Bosch GmbH 1 CO 2 Emission [g/km] Perspective on CO 2 - Penetration in CV-Market Development of average CO 2 -emission in Europe Heavy

More information

Massive scale analytics with Stratosphere using R

Massive scale analytics with Stratosphere using R Massive scale analytics with Stratosphere using R Jose Luis Lopez Pino jllopezpino@gmail.com Database Systems and Information Management Technische Universität Berlin Supervised by Volker Markl Advised

More information

ANALYSIS OF WEBSITE USAGE WITH USER DETAILS USING DATA MINING PATTERN RECOGNITION

ANALYSIS OF WEBSITE USAGE WITH USER DETAILS USING DATA MINING PATTERN RECOGNITION ANALYSIS OF WEBSITE USAGE WITH USER DETAILS USING DATA MINING PATTERN RECOGNITION K.Vinodkumar 1, Kathiresan.V 2, Divya.K 3 1 MPhil scholar, RVS College of Arts and Science, Coimbatore, India. 2 HOD, Dr.SNS

More information

TIM 50 - Business Information Systems

TIM 50 - Business Information Systems TIM 50 - Business Information Systems Lecture 15 UC Santa Cruz March 1, 2015 The Database Approach to Data Management Database: Collection of related files containing records on people, places, or things.

More information

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps White provides GRASP-powered big data predictive analytics that increases marketing effectiveness and customer satisfaction with API-driven adaptive apps that anticipate, learn, and adapt to deliver contextual,

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

A Comparative Study of Different Log Analyzer Tools to Analyze User Behaviors

A Comparative Study of Different Log Analyzer Tools to Analyze User Behaviors A Comparative Study of Different Log Analyzer Tools to Analyze User Behaviors S. Bhuvaneswari P.G Student, Department of CSE, A.V.C College of Engineering, Mayiladuthurai, TN, India. bhuvanacse8@gmail.com

More information

Big Data Mining Services and Knowledge Discovery Applications on Clouds

Big Data Mining Services and Knowledge Discovery Applications on Clouds Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy talia@dimes.unical.it Data Availability or Data Deluge? Some decades

More information

St Petersburg College. Office of Professional Development. Technical Skills. Adobe

St Petersburg College. Office of Professional Development. Technical Skills. Adobe St Petersburg College Office of Professional Development Technical Skills Adobe Adobe Photoshop PhotoShop CS4: Getting Started PhotoShop CS4: Beyond the Basics Adobe Illustrator Illustrator CS4: Getting

More information

Introduction to Pattern Recognition

Introduction to Pattern Recognition Introduction to Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)

More information

Defense Technical Information Center Compilation Part Notice

Defense Technical Information Center Compilation Part Notice UNCLASSIFIED Defense Technical Information Center Compilation Part Notice ADP012353 TITLE: Advanced 3D Visualization Web Technology and its Use in Military and Intelligence Applications DISTRIBUTION: Approved

More information

Implementing Data Models and Reports with Microsoft SQL Server 2012 MOC 10778

Implementing Data Models and Reports with Microsoft SQL Server 2012 MOC 10778 Implementing Data Models and Reports with Microsoft SQL Server 2012 MOC 10778 Course Outline Module 1: Introduction to Business Intelligence and Data Modeling This module provides an introduction to Business

More information

Anomaly Detection in Predictive Maintenance

Anomaly Detection in Predictive Maintenance Anomaly Detection in Predictive Maintenance Anomaly Detection with Time Series Analysis Phil Winters Iris Adae Rosaria Silipo Phil.Winters@knime.com Iris.Adae@uni-konstanz.de Rosaria.Silipo@knime.com Copyright

More information

CIP Safety on. Joaquin Ocampo, Bosch Rexroth USA Gary Thrall, Bosch Rexroth USA. Drive for Technology Expo

CIP Safety on. Joaquin Ocampo, Bosch Rexroth USA Gary Thrall, Bosch Rexroth USA. Drive for Technology Expo CIP Safety on Joaquin Ocampo, Bosch Rexroth USA Gary Thrall, Bosch Rexroth USA Accelerate your Innovation with CMA/Flodyne/Hydradyne Drive for Technology Expo Trade Show & Technical Symposium April 15-16,

More information

Data Mining mit der JMSL Numerical Library for Java Applications

Data Mining mit der JMSL Numerical Library for Java Applications Data Mining mit der JMSL Numerical Library for Java Applications Stefan Sineux 8. Java Forum Stuttgart 07.07.2005 Agenda Visual Numerics JMSL TM Numerical Library Neuronale Netze (Hintergrund) Demos Neuronale

More information

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole Paper BB-01 Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole ABSTRACT Stephen Overton, Overton Technologies, LLC, Raleigh, NC Business information can be consumed many

More information

Challenges and Lessons from NIST Data Science Pre-pilot Evaluation in Introduction to Data Science Course Fall 2015

Challenges and Lessons from NIST Data Science Pre-pilot Evaluation in Introduction to Data Science Course Fall 2015 Challenges and Lessons from NIST Data Science Pre-pilot Evaluation in Introduction to Data Science Course Fall 2015 Dr. Daisy Zhe Wang Director of Data Science Research Lab University of Florida, CISE

More information

Master's projects at ITMO University. Daniil Chivilikhin PhD Student @ ITMO University

Master's projects at ITMO University. Daniil Chivilikhin PhD Student @ ITMO University Master's projects at ITMO University Daniil Chivilikhin PhD Student @ ITMO University General information Guidance from our lab's researchers Publishable results 2 Research areas Research at ITMO Evolutionary

More information

University of Glasgow - Programme Structure Summary C1G5-5100 MSc Bioinformatics, Polyomics and Systems Biology

University of Glasgow - Programme Structure Summary C1G5-5100 MSc Bioinformatics, Polyomics and Systems Biology University of Glasgow - Programme Structure Summary C1G5-5100 MSc Bioinformatics, Polyomics and Systems Biology Programme Structure - the MSc outcome will require 180 credits total (full-time only) - 60

More information

Analytics on Big Data

Analytics on Big Data Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis

More information

Some vendors have a big presence in a particular industry; some are geared toward data scientists, others toward business users.

Some vendors have a big presence in a particular industry; some are geared toward data scientists, others toward business users. Bonus Chapter Ten Major Predictive Analytics Vendors In This Chapter Angoss FICO IBM RapidMiner Revolution Analytics Salford Systems SAP SAS StatSoft, Inc. TIBCO This chapter highlights ten of the major

More information

Information Retrieval Elasticsearch

Information Retrieval Elasticsearch Information Retrieval Elasticsearch IR Information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches

More information

SIMCA 14 MASTER YOUR DATA SIMCA THE STANDARD IN MULTIVARIATE DATA ANALYSIS

SIMCA 14 MASTER YOUR DATA SIMCA THE STANDARD IN MULTIVARIATE DATA ANALYSIS SIMCA 14 MASTER YOUR DATA SIMCA THE STANDARD IN MULTIVARIATE DATA ANALYSIS 02 Value From Data A NEW WORLD OF MASTERING DATA EXPLORE, ANALYZE AND INTERPRET Our world is increasingly dependent on data, and

More information

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are

More information

PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY

PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY QÜESTIIÓ, vol. 25, 3, p. 509-520, 2001 PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY GEORGES HÉBRAIL We present in this paper the main applications of data mining techniques at Electricité de France,

More information

Clustering & Visualization

Clustering & Visualization Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.

More information

Context Aware Predictive Analytics: Motivation, Potential, Challenges

Context Aware Predictive Analytics: Motivation, Potential, Challenges Context Aware Predictive Analytics: Motivation, Potential, Challenges Mykola Pechenizkiy Seminar 31 October 2011 University of Bournemouth, England http://www.win.tue.nl/~mpechen/projects/capa Outline

More information

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam

More information

Machine Learning with MATLAB David Willingham Application Engineer

Machine Learning with MATLAB David Willingham Application Engineer Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the

More information

6.2.8 Neural networks for data mining

6.2.8 Neural networks for data mining 6.2.8 Neural networks for data mining Walter Kosters 1 In many application areas neural networks are known to be valuable tools. This also holds for data mining. In this chapter we discuss the use of neural

More information

Data Mining as Part of Knowledge Discovery in Databases (KDD)

Data Mining as Part of Knowledge Discovery in Databases (KDD) Mining as Part of Knowledge Discovery in bases (KDD) Presented by Naci Akkøk as part of INF4180/3180, Advanced base Systems, fall 2003 (based on slightly modified foils of Dr. Denise Ecklund from 6 November

More information

Information Architecture

Information Architecture The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to

More information

Cisco Data Preparation

Cisco Data Preparation Data Sheet Cisco Data Preparation Unleash your business analysts to develop the insights that drive better business outcomes, sooner, from all your data. As self-service business intelligence (BI) and

More information

An intelligent tool for expediting and automating data mining steps. Ourania Hatzi, Nikolaos Zorbas, Mara Nikolaidou and Dimosthenis Anagnostopoulos

An intelligent tool for expediting and automating data mining steps. Ourania Hatzi, Nikolaos Zorbas, Mara Nikolaidou and Dimosthenis Anagnostopoulos An intelligent tool for expediting and automating data mining steps Ourania Hatzi, Nikolaos Zorbas, Mara Nikolaidou and Dimosthenis Anagnostopoulos Outline Data Mining, current tools An intelligent tool

More information

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

More information