RAPIDMINER FREE SOFTWARE FOR DATA MINING, ANALYTICS AND BUSINESS INTELLIGENCE. Luigi Grimaudo 178627 Database And Data Mining Research Group

Summary
- RapidMiner project
- Strengths
- How to use RapidMiner
- Operator highlights
- RapidMiner GUI
- References

RapidMiner Project
- A fully integrated environment for machine learning, data mining, text mining, predictive analytics and business intelligence.
- It is distributed under the AGPL open source license and has been hosted by SourceForge since 2004.
- It can be used as a stand-alone application for data analysis or as a data mining engine integrated into your own code.

RapidMiner Project: Short History
- The RapidMiner project started in 2001 under the name YALE (Yet Another Learning Environment). The main contributors were Ralf Klinkenberg, Ingo Mierswa, and Simon Fischer from the Artificial Intelligence Unit of the University of Dortmund.
- In 2006 Ingo Mierswa and Ralf Klinkenberg founded the company Rapid-I, and on Tuesday, May 29th, 2007 the software was renamed RapidMiner.
- Rapid-I now maintains and further develops RapidMiner and supports its users.

Strengths
- Data integration, analytical ETL, data analysis, and reporting in one single tool.
- Friendly and powerful graphical user interface for the design of analysis processes.
- Repositories for process, data and metadata handling.
- On-the-fly error recognition and quick fixes to help users.
- Complete and flexible library with hundreds of data loading, data transformation, data modeling and data visualization methods.
- Open-source code (Java) available to extend and modify the data mining and analysis system.

Strengths
- The well-known machine learning library WEKA is fully integrated in the core of the system.
- An internal XML representation ensures a standardized interchange format for processes.
- Graphical process design for standard tasks and a scripting language for arbitrary operations.
- Access to data sources such as Excel, Access, Oracle, IBM DB2, Microsoft SQL, Sybase, Ingres, MySQL, Postgres, SPSS, dBase, text files and more.
- Very powerful high-dimensional plotting facilities.
- Several plugins are already available, and extension mechanisms allow operators to be customized and adapted.

How to use RapidMiner
RapidMiner can be used in several ways:
- As a standalone tool: use the simple GUI to connect the required operators into a process, execute it, and view the results directly in the RapidMiner environment.
- As a batch process: build the workflow in the GUI, then execute it by running the RapidMiner script with the XML process as input.
- As a Java API: integrate RapidMiner into your own data mining or business intelligence code by building the required process directly in Java.
- As a hybrid solution: build the process with the GUI, then execute and manage it from Java code.

Operator highlights: Data mining modeling
- Support Vector Machines (SVM)
- Rule learners
- Decision trees
- Bayes
- Gaussian Processes
- Neural Networks
- Evolutionary optimization
- Boosting
- Apriori
- FPGrowth
- Clustering
- and many others
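To make one of the listed techniques concrete, here is a minimal sketch of the Apriori idea (level-wise search for frequent itemsets) in plain Python. It does not use RapidMiner or its operators; the function name `apriori`, the `min_support` count threshold, and the shopping-basket data are assumptions chosen for illustration only.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support_count} for all itemsets meeting min_support (a count)."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count how many transactions contain each candidate, keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    frequent = count({frozenset([i]) for i in items})
    result = dict(frequent)
    k = 2
    while frequent:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(frequent), 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        frequent = count(candidates)
        result.update(frequent)
        k += 1
    return result

# Hypothetical example data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent_itemsets = apriori(baskets, min_support=2)
```

With a threshold of two transactions, the pair milk and butter is dropped at level 2, so no three-item candidate survives the prune step.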

Operator highlights: ETL and OLAP
- Aggregation
- Discretization
- Normalization
- Filter
- Sampling
- PCA
- Missing value replenishment
- and many more
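As a rough illustration of what two of these transformations do, the sketch below implements min-max normalization and equal-width discretization in plain Python. It is not RapidMiner code; the function names and the sample `ages` column are assumptions made for this example.

```python
def min_max_normalize(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def equal_width_bins(values, n_bins):
    """Assign each value a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical example column.
ages = [18, 30, 42, 54, 66]
normalized = min_max_normalize(ages)
binned = equal_width_bins(ages, n_bins=3)
```

Other normalization and discretization schemes exist (for example z-score scaling or equal-frequency bins); this sketch shows only the simplest variant of each.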

Operator highlights: Evaluation
- Cross-validation
- Leave-one-out
- Sliding time windows
- Back testing
- Significance tests
- ROC
- etc.
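The first two evaluation schemes can be sketched as index splits. The following plain-Python example is an illustration under simplifying assumptions (contiguous folds, no shuffling or stratification) and is not how RapidMiner implements them; the function names are hypothetical.

```python
def k_fold_splits(n_examples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_examples:
        raise ValueError("k must be between 2 and the number of examples")
    indices = list(range(n_examples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

def leave_one_out_splits(n_examples):
    """Leave-one-out is k-fold cross-validation with k equal to the number of examples."""
    return k_fold_splits(n_examples, n_examples)

folds = list(k_fold_splits(10, 3))
loo = list(leave_one_out_splits(4))
```

Each example appears in exactly one test fold, so every example is used for testing once and for training in all other rounds.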

RapidMiner GUI
The GUI generates an XML file that defines the analytical processes the user wishes to apply to the data. This file is then read by the RapidMiner engine to run the analyses automatically. While the analyses are running, the GUI can also be used to control and inspect them interactively.
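The idea of a process stored as XML and read back by an engine can be sketched as follows. The XML below is a hypothetical, simplified description: its element and attribute names, operator classes, and the file name are assumptions for illustration and are not claimed to match RapidMiner's real process format.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified process description. The element and attribute
# names are illustrative and are not claimed to match RapidMiner's real schema.
PROCESS_XML = """
<process>
  <operator name="Load Data" class="read_csv">
    <parameter key="file" value="customers.csv"/>
  </operator>
  <operator name="Train Model" class="decision_tree">
    <parameter key="max_depth" value="5"/>
  </operator>
  <connect from_op="Load Data" to_op="Train Model"/>
</process>
"""

def describe_process(xml_text):
    """Return the operators (with parameters) and connections found in the XML."""
    root = ET.fromstring(xml_text.strip())
    operators = [
        {
            "name": op.get("name"),
            "class": op.get("class"),
            "parameters": {p.get("key"): p.get("value") for p in op.findall("parameter")},
        }
        for op in root.findall("operator")
    ]
    connections = [(c.get("from_op"), c.get("to_op")) for c in root.findall("connect")]
    return operators, connections

operators, connections = describe_process(PROCESS_XML)
```

An engine would go on to instantiate and execute each operator in connection order; this sketch stops at reading the description.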

RapidMiner GUI: Perspectives
- Design Perspective: the central RapidMiner perspective, where all analysis processes are created and managed.
- Result Perspective: if a process supplies results, RapidMiner switches to this perspective to display them.
- Welcome Perspective: the first perspective shown when RapidMiner is launched, where you can see the most recently executed processes and some logs.

Design Perspective
This view presents all the work steps (called operators) available in RapidMiner; they are the building blocks of every process. The repository section is used to manage and structure your analysis processes into projects, and also serves as a source of both data and the associated metadata.

Design Perspective
The process view shows the individual steps within the analysis process as well as their connections. New steps can be added to the current process. Connections between them can be defined and detached.

Operators
Working with RapidMiner fundamentally consists of defining an analysis process as a succession of operators. Operators consume their inputs and deliver their outputs through ports. Every operator is defined by its inputs, its outputs, the action it performs, and its parameters.
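The operator concept described above can be sketched as a toy model in plain Python. This is not RapidMiner's API: the `Operator` class, the `run_process` helper, and the single input and output port per operator are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Operator:
    """Toy operator: a named action with parameters, one input and one output port."""
    name: str
    action: Callable[[list, Dict[str, object]], list]
    parameters: Dict[str, object] = field(default_factory=dict)

    def apply(self, data: list) -> list:
        # Data arriving at the input port is transformed and sent to the output port.
        return self.action(data, self.parameters)

def run_process(operators: List[Operator], data: list) -> list:
    """Run the operators in succession, feeding each output into the next input."""
    for op in operators:
        data = op.apply(data)
    return data

# Hypothetical two-step process: keep values of at least 10, then double them.
process = [
    Operator("Filter", lambda rows, p: [r for r in rows if r >= p["minimum"]], {"minimum": 10}),
    Operator("Scale", lambda rows, p: [r * p["factor"] for r in rows], {"factor": 2}),
]
result = run_process(process, [5, 10, 20])
```

Changing a parameter such as `minimum` alters the behaviour of one step without touching the rest of the chain, which is the point of separating an operator's action from its parameters.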

Result Perspective
Objects placed at the result ports on the right-hand side of a process are automatically displayed in the Result Perspective after the process completes. Each currently open result is displayed as an additional tab in this area.

Plot View
One of the strongest features of RapidMiner is the large number of visualisation methods for data, other tables, models and results offered in the Plot View.

References
- Mierswa, I., Wurst, M., Klinkenberg, R., Scholz, M. and Euler, T.: Yale (now: RapidMiner): Rapid Prototyping for Complex Data Mining Tasks. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006).
- http://rapid-i.com