KNIME Open Source Days 2012, Sep 3-7, Konstanz, Germany
Mind Era - who are we?
  Mind Eratosthenes Kft., Budapest, Hungary, mind-era.com
  Katalin Bakos: CEO, sister
  Gábor Bakos: mathematician, software engineer, brother
  KOS Days 2012
RapidMiner, HiTS - what is it?
  RapidMiner: another open source framework for data mining
    We integrated it into KNIME; it works like a metanode
  HiTS: nodes that support data analysis of High Throughput/Content Screenings
    Contains nodes to perform cellHTS2 transformations, visualize data, and transform data, plus a failed experiment to handle/search images using Bio-Formats
RapidMiner, HiTS - highlights
  RapidMiner Node: executes and edits RapidMiner workflows (processes)
  RapidMiner Viewer Node: helps visualize data
  HiTS nodes: Leaf ordering, Reverse Order, Sort by Cluster, Dendrogram with Heatmap, Simple Heatmap, Rank, Direct Product, Merge (kind of antisort), Pivot, Unpivot, Subsets
STARK
  Joint initiative: KNIME + PASCAL2
  Prof. José L. Balcázar (UC, now UPC): proposer and part-time programmer
  Personnel from Universidad de Cantabria:
    Javier de la Dehesa (senior undergrad, now grad student, coded most of it)
    Diego García-Sáiz (grad student)
    Cristina Tîrnauca (post-doc)
STARK - what is it?
  Self-Tuning Association Rules for KNIME
  KNIME node that performs association rule mining with very low configuration needs
  Tuning the support threshold and choosing rule interest measures are very difficult tasks for end users
  We proposed a self-tuning approach: decreasing support traversal, confidence boost
  Prototype in Python: yacaree.sf.net
  Now: porting it into KNIME
  Will try to sell it to you all these days...
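To make the two quantities behind that tuning problem concrete, here is a minimal Python sketch of the standard definitions of support and confidence for an association rule. It is not the STARK/yacaree algorithm (the slides only name its ingredients, decreasing support traversal and confidence boost); the function names and the toy transactions are assumptions made for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Standard confidence of the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    supp_antecedent = support(transactions, antecedent)
    if supp_antecedent == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(transactions, both) / supp_antecedent


# Hypothetical toy data, not taken from the slides.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

supp_bread = support(transactions, {"bread"})
conf_bread_milk = confidence(transactions, {"bread"}, {"milk"})
```

In a classical miner the user has to pick minimum values for both numbers by hand; a self-tuning node aims to remove that step.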
Current status
  Yacaree Node exists now
    The confidence boost handling needs a bit of improvement
    The usage is a bit complicated
  BUT: the Python version went ahead
    The KNIME node is a bit behind
    Algorithms have advanced even further conceptually
  Trying to catch up this week!
GenericWorkflowNodes for SeqAn and OpenMS
  Freie Universität Berlin
  Prof. Knut Reinert: Head of Algorithmic Bioinformatics group
  Stephan Aiche: Research Associate
  Björn Kahlert: Research Associate
GenericWorkflowNodes for SeqAn/OpenMS - what is it?
  GenericWorkflowNodes: wrap existing tools into KNIME nodes
  SeqAn/OpenMS: open source frameworks for sequence analysis and analysis of mass spectrometry data
    Developed at Freie Universität Berlin (SeqAn, OpenMS) and Universität Tübingen (OpenMS)
GenericWorkflowNodes - highlights
  SeqAn/OpenMS Nodes: most OpenMS and SeqAn apps are available in KNIME
  CTD (Common Tool Description): a generic XML-based description of command line tools
  Translate any tool you need into a KNIME node based on a CTD for the tool
Cortana - who are we?
  Leiden University
  Arno Knobbe: post-doc, occasional programmer
  Marvin Meeng: main developer
  Wouter Duivesteijn, Michael Mampaey, Rob Konijn
Cortana - what is it?
  Modern Subgroup Discovery tool
  Developed at Leiden University
  Research vehicle to address problems in Subgroup Discovery
  Analysis tool used in many domains:
    Bank transaction data
    Bioinformatics (Genomics/Metabolomics)
    Chemical drug compound efficacy
Cortana - highlights
  Generic SD algorithm
    Target Type / Quality Measure
    Search conditions / Search strategy
  Visualisation and manipulation of both Data and Results
    Table, Histogram, Scatter plot, DAG
    Change data type, missing values
    Subgroup inspection, ROC plots
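As an illustration of what a quality measure in Subgroup Discovery computes, the sketch below implements Weighted Relative Accuracy (WRAcc), a measure commonly used in the Subgroup Discovery literature for a binary target. The slides do not say which measures Cortana offers, so the choice of WRAcc, the function name, and the toy data are all assumptions.

```python
def wracc(in_subgroup, is_positive):
    """Weighted Relative Accuracy: coverage * (target rate in subgroup - overall target rate).

    in_subgroup -- one boolean per record: does the record belong to the subgroup?
    is_positive -- one boolean per record: does the record have the target value?
    """
    n = len(in_subgroup)
    n_sub = sum(1 for s in in_subgroup if s)
    if n == 0 or n_sub == 0:
        return 0.0  # an empty subgroup carries no information
    p_target = sum(1 for y in is_positive if y) / n
    positives_in_sub = sum(1 for s, y in zip(in_subgroup, is_positive) if s and y)
    p_target_in_sub = positives_in_sub / n_sub
    coverage = n_sub / n
    return coverage * (p_target_in_sub - p_target)


# Hypothetical toy data: 8 records, the subgroup covers the first 4.
in_subgroup = [True, True, True, True, False, False, False, False]
is_positive = [True, True, True, False, False, False, True, False]

score = wracc(in_subgroup, is_positive)
```

A positive score means the target is more frequent inside the subgroup than in the whole data set; a search strategy would rank candidate subgroups by such a score.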
Palladian KNIME Open Source Days, Konstanz 03.09.2012 Klemens Muthmann, TU Dresden
About Us
  Information retrieval team, Chair of Computer Networks (Lehrstuhl Rechnernetze), TU Dresden
  Klemens Muthmann
  Philipp Katz
  David Urbansky (not pictured)
About Palladian
  Java-based toolkit for information retrieval
  Provides users with a basic set of tools
  Palladian's strengths: text classification, feed reading, named entity recognition, date recognition, keyword extraction, content scraping
Highlights Palladian text classifier for sentiment analysis
Highlights
PMM Lab - who are we?
  Federal Institute for Risk Assessment, Germany
  Christian Thöns: Programmer and Research Assistant
  And others (Matthias Filter, Jörgen Brandt, Armin Weiser, Alexander Falenski)
PMM Lab - what is it?
  Collection of KNIME nodes for Predictive Microbiology
  Developed at the Federal Institute for Risk Assessment since 2011
  Provides nodes for fitting and visualizing Predictive Microbiology models
KNIME - highlights
  Views for PMM models and data
  User can enter new models (model equations are parsed with JEP)
Who are we?
  University of Tübingen / Applied Bioinformatics group
  Expertise in proteomics/metabolomics, drug design, molecular modelling, sequence analysis, systems biology and immunoinformatics
  Prof. Oliver Kohlbacher: Head of Applied Bioinformatics Group
  Luis de la Garza: PhD Student
  Kohlbacher, de la Garza, Applied Bioinformatics Group
What do we do?
  Workflows on grid systems
  Integration of Computer Aided Drug Design Suite (CADDSuite) and OpenMS as KNIME nodes
  GenericKnimeNodes development together with FU Berlin
Highlights - CADDSuite
  Flexible and open workflow-enabled framework for computer-aided drug design
  Part of the Biochemical Algorithms Library (BALL) Project
  Offers solutions to common tasks in drug design such as file format conversion, molecule preparation, docking, etc.
Highlights - OpenMS
  Open mass spectrometry / liquid chromatography C++ library
  Offers visualization of data, proteomics pipelining, workflow modeling engine, signal processing, feature finding, etc.
Who are we: Robert Bosch GmbH, DS/ETM
  Alexander Warta: Test Engineer, Student Tutor
    Robert Bosch GmbH, Diesel Systems, Engineering Test Methods (DS/ETM1)
    alexander.warta@de.bosch.com
  Computer Science Students (Master)
    Markus John (05/2012-10/2012)
    two other students
  Diesel Systems DS/ETM1-Wr, -Jo, 03.09.2012, 209-2283
  Robert Bosch GmbH 2012. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.
Why KNIME: Context and Challenge
  To design diesel fuel injection systems for global markets, Robert Bosch GmbH considers many market-specific diesel fuel quality parameters.
  For this, a service provider regularly performs chemical analyses of fuel samples from almost all countries (so-called fuel surveys).
  One survey sample record contains up to 140 attributes, e.g. date, town, country, supplier, and the results of chemical and physical analyses such as sulfur content, density, viscosity, and biodiesel content.
  About 10,000 records are currently relevant.
  The previous process combined Microsoft Excel (plots, histograms, etc.) and PowerPoint (world map) in a non-automated sequence.
  This procedure is time-consuming, non-interactive, inflexible, and not scalable.
Why KNIME: Catalog of Requirements
  Extract knowledge through interactive exploration
  Easy access to all fuel surveys with filter methods
  Generate choropleth maps and cartograms
    Show country names
    Show additional diagrams for each country
    Show only selected countries
    Enrich map with external data (like cities of the fuel survey records, locations of oil refineries, etc.)
  Generate star plots, parallel coordinates, scatterplots
  Apply data mining algorithms for finding new patterns between instances and features (like association rule learning, hierarchical clustering, multidimensional scaling)
  Enrich fuel survey data with external data (like new diesel car registrations, failure count of the common rail system, etc.)
Highlights: KNIME Node GenericWorldMap
  Generates world maps based on statistical attributes, with additional dimensions shown as bars and scalable icons
Highlights: KNIME Node FuelSurveyVisualizer
  Generates boxplots, starplots, etc. interactively by integrating R
Highlights: KNIME Node FuelSurveyStandardAnalysis
  Creates standard presentation slides automatically by integrating Apache POI and R
Highlights: KNIME Node FuelSurveyWarnSystem
  Early warning system that quickly identifies worsening fuel quality by integrating JBoss Drools (rule-based system) and Apache POI (generating Excel and Word file output)
  Status: ongoing
Developed KNIME Nodes
  Selection Preprocessing Transformation FuelSurveyReader FuelSurveyDeleter StandardAnalysisXML Modeling Neighbors LocalOutlierDetection DistanceBasedkMeans Percentizer RefineryReader Visualization GenericWorldMap FuelSurveyVisualizer StandardAnalysis LocationTransformer DynamicColumnFilter MultipleReference RowFilter LoopColumnToVariable Elbow FuelSurveyWarnSystem FuelSurveyWarnSystemXML
Who are we?
  Christian Dietz: Image Processing
  Martin Horn: Image Processing
  Tobias Kötter: Network Mining
  Michael Zinsmaier: Image Processing
Our Projects (1/2)
  Network Mining
    Framework to process attributed graphs
    Supports (un)directed, (un)weighted (hyper/multi/k-partite) graphs
  Indexing & Searching
    High-performance indexing and advanced querying
    Based on Apache Lucene
Our Projects (2/2)
  Image Processing and Analysis
    Extension to process and analyse multidimensional images
    Integrates state-of-the-art libraries: ImgLib2, BioFormats, ImageJ, ImageJ2, OMERO
KNIME
  Iris Adä: Modular Data Generation, Ensemble Methods, JFreeChart
  Zaenal Akbar: Parallel Data Mining
  Violeta Ivanova: Parallel Data Mining
  Sebastian Peter: Web Analytics, JFreeChart
  05.09.2012, KNIME Open Source Days
KNIME
  Dawid Piatek: Statistics Guru
  Thorsten Meinl: Optimization & Build System
  Thomas Gabriel: Database Connectors & R
  Peter Ohl: File Reader & Server Development
KNIME
  Bernd Wiswedel: Data Handling
  Aaron Hart: (Magic) Support
  Michael Berthold: The Godfather
KNIME
  Heather Fyson: Keeps everything running
  Peter Burger: System Administrator & BBQ master