BET : An Inductive Logic Programming Workbench
Srihari Kalgi, Chirag Gosar, Prasad Gawde, Ganesh Ramakrishnan, Kekin Gada, Chander Iyer, Kiran T.V.S, and Ashwin Srinivasan
Department of Computer Science and Engineering, IIT Bombay, India
{cgosar,ganesh}@cse.iitb.ac.in
{chanderjayaraman,kekin.gada,prasg41,srihari.kalgi,t.kiran05}@gmail.com
[email protected]

Abstract. Existing ILP (Inductive Logic Programming) systems are implemented in different languages, namely C, Prolog, etc. Moreover, each system has its own customized format for the input data. This makes it tedious and time consuming for a user to employ such a system for experimental purposes, as it demands a thorough understanding of that system and its input specification. In the spirit of Weka [1], we present a relational learning workbench called BET (Background + Examples = Theories), implemented in Java. The objective of BET is to shorten the learning curve of users (including novices), to facilitate speedy development of new relational learning systems, and to allow quick integration of existing ILP systems. The standardized input format makes it easier to experiment with different relational learning algorithms on a common dataset.

Key words: BET, ILP systems, Golem, Progol, FOIL, PRISM, TILDE

1 Introduction

There have been several Inductive Logic Programming (ILP) system implementations. Each system has its own specification for input data, and different systems do not necessarily agree on their inputs. This often makes comparisons across implementations tricky, owing to differences in the names or semantics of parameters and/or in the choice of programming language. Very often, the primitives and core components employed, such as theorem provers, SAT solvers, inference engines, etc., also differ across implementations, rendering comparisons of computation time and accuracy less reliable. This paper discusses a workbench for ILP called BET.
BET is developed in Java and standardizes the specification of inputs using XML (eXtensible Markup Language). It provides a common framework for building as well as integrating different ILP systems. The provision for implementing
algorithms in a common language (Java) improves the feasibility of comparing algorithms on their computational speed, while the input standardization enables sound comparison of accuracies (and related measures) and allows experimenting with multiple ILP systems on the same dataset, without any conversion to a system-specific input format. BET includes several evaluation functions and operator primitives such as Least General Generalization (LGG) [9], upward cover, downward cover, etc. These features facilitate the rapid development of new relational learning systems and also ease the integration of existing relational systems into BET. BET also allows a user to choose from multiple theorem provers as plug-and-play components. Presently, YAP (Yet Another Prolog) [12] and SWI-Prolog [13] are included in the basic package. The initial version of BET has three relational systems implemented, namely FOIL [4], Golem [3] and TILDE [2][10], and four relational systems integrated, namely Progol [11], FOIL, Golem and PRISM [5][6]. (Though PRISM is not a standard ILP system, it has been integrated with the specific intention of learning probabilities in order to associate uncertainty with theories learnt using ILP systems.)

We proceed with an overview of BET and the motivation for developing such a system in Section 2. Section 3 focuses on the design of the BET system. Section 4 describes the class diagram and major components of BET. Section 5 explains how new relational learning systems can be integrated into or implemented within BET. Section 6 compares BET with existing workbenches and highlights the advantages of BET. We summarize BET in Section 7.

2 Overview

A principal motivation for developing a system like BET is the reduction of the learning curve for both expert users and novices, as well as programmers, in the area of relational learning, particularly ILP. For example, with BET, the end user needs to understand only the standardized input parameters of BET.
This reduces the time overheads involved in comparing different algorithmic implementations as well as in experimenting with individual implementations. More specifically, a user need not convert datasets from one format to another while switching between ILP systems. Further, the standardized APIs in BET make the development of algorithms within BET much easier.

Figure 1 shows the block diagram of BET. Any ILP system in BET takes four files as its input, namely positive examples, negative examples, background knowledge and language restriction files, and it outputs learned theories. There are two main components of BET, namely the BET GUI and the BET Engine. The BET Engine is the back-end engine at the core of BET, and the BET GUI communicates with the engine through a properties file. BET users can experiment with the system using the GUI, whereas programmers can use the APIs provided and extend BET with new ILP or relational learning systems. Section 5 explains how BET can be extended.
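The properties-file handoff between the GUI and the engine can be pictured with a short sketch. The key names below are purely illustrative assumptions for a run of the ancestor task, not BET's actual property schema:

```properties
# Hypothetical BET run configuration (key names are illustrative only)
system=golem
positive.examples=ancestor_pos.xml
negative.examples=ancestor_neg.xml
background.knowledge=ancestor_bgk.xml
language.restriction=ancestor_lang.xml
theorem.prover=yap
output.theory=ancestor_theory.txt
```

The GUI would write such a file and the engine would read it back, which keeps the two components decoupled.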
Fig. 1. Block diagram of BET

3 Design of BET

Almost all ILP algorithms and systems require the following five inputs: positive examples, negative examples, background knowledge, mode declarations, and hyper-parameters for tuning the system. The clauses learnt may be customized by means of the mode declarations, though some algorithms/systems may not allow for the specification of mode declarations (an example is the Confidence-based Concept Discovery system [7]). In this section, we discuss the need for a standard input format, and then specify the BET standard input file formats.

3.1 Need for a Standard Format

There is little or no consensus between the input specifications of ILP systems. For example, FOIL has its own input specification, which is completely different from that of Golem or Progol. FOIL accepts the positive examples, negative examples and background knowledge in a single file. On the other hand, Golem takes three files as input, viz. a positive example file (.f), a negative example file (.n) and a background knowledge file (.b) which includes the mode declarations. Another ILP system, Progol, has an input format similar to Golem's, but with many more setting parameters. Clearly, any end user desiring to experiment with a new ILP system is required to understand its input format thoroughly and convert the data in hand to that format before he/she can use it. BET assists the end user with a standardized input specification that is generic enough to be converted to the specification of any existing ILP system. Section 3.2 discusses the BET standard formats.
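To make the contrast concrete, here is roughly what Golem-style input for the ancestor task looks like, split across three files of Prolog clauses (the file contents are illustrative); BET's standard format carries the same information in XML so that no per-system conversion is needed:

```prolog
% ancestor.f -- positive examples
ancestor(henry, john).
ancestor(henry, mary).

% ancestor.n -- negative examples
ancestor(john, henry).

% ancestor.b -- background knowledge (Golem also puts mode declarations here)
parent(henry, jane).
parent(jane, john).
parent(jane, mary).
```

FOIL would instead expect all of the above, in its own syntax, in a single file.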
Sample positive example, negative example, background knowledge and language restriction files for learning the ancestor relation are shown in Tables 1–3 in the document available at [15]. The hyper-parameters can be specified in the language restriction file. The format for each file is as follows:

Positive Example File: This file consists of clauses which represent positive examples of the relation (predicate) for which the theory needs to be learned.

Negative Example File: This file consists of clauses which represent negative examples of the predicate (relation) for which the theory needs to be learned. Some ILP systems do not consider negative examples during construction of the theory, but may utilize them for pruning the theories that have been learned.

Background Knowledge File: The background knowledge (BGK) file contains all the domain-specific information required by the ILP system in order to learn the target predicate. All this information should be in the form of clauses; the file can contain rules as well as facts.

Language Restriction File: The structured search space (a lattice or simply a partial order) of clauses which needs to be searched in order to arrive at a reasonably good hypothesis is almost infinite. Language restrictions endow an ILP system with the flexibility of reducing this search space. There are three generic parts to the language restriction, viz. mode declarations, types and determinations. System-specific hyper-parameters such as the stopping threshold, minimum accuracy, etc. are also specified in this file. Each part is explained below.

Type: A type declaration defines the type of constants. For example, the constant john is of type person, and the constant 1 is of type integer. Type specification is done in the language restriction file.

Mode Declarations: Mode declarations declare the modes of the predicates that can be present in any of the clausal theories.
The mode declaration takes the form mode(RecallNumber, PredicateMode), where:

RecallNumber: bounds the non-determinacy of a form of predicate call.
PredicateMode: has the syntax predicate(ModeType, ModeType, ...), where each ModeType is one of +Type, -Type or #Type:
  +Type: the term must be an input variable of type Type.
  -Type: the term must be an output variable of type Type.
  #Type: the term must be a ground constant of type Type.
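For instance, mode declarations for learning ancestor/2 in terms of parent/2 might be written as follows (Progol-style syntax; the concrete syntax accepted by BET's language restriction file may differ):

```prolog
% Head predicate: recall 1 bounds the call to at most one answer;
% the first argument is an input person, the second an output person.
mode(1, ancestor(+person, -person)).
% Body predicate: '*' places no bound on the recall.
mode(*, parent(+person, -person)).
```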
Determination: This component of the language restriction specifies which predicates are to be learned in terms of which other predicates.

Hyper-parameters: Hyper-parameters are system specific; the parameters for a given system can be listed by running system-name help or found through the graphical interface for that particular system.

4 Class Diagram and Components of BET

The class diagram of BET is shown in Figure 1 in the document available at [15]. The major components of BET are explained below.

ClauseSet: A ClauseSet is a set of clauses, a notion often referred to in the description of ILP algorithms. The ClauseSet interface defines the signatures of the methods used to access a ClauseSet object.

LanguageRestriction: LanguageRestriction is an interface for mode declarations, determinations and types.

InputEncoder: InputEncoder is mainly used for integrating legacy systems (i.e. ILP systems written in native code) into BET. Any wrapper around an integrated legacy system requires conversion of the BET standard data format to the format accepted by the legacy system. A legacy system to be integrated into BET must therefore implement the InputEncoder interface and provide a method for converting the standard BET file format to the format accepted by the legacy system.

TheoryLearner: TheoryLearner is an important component of BET. It is responsible for learning the theory (hypothesis) from the input. TheoryLearner has objects of type InputEncoder and EvalMetric (explained next) as its data members. If the ILP system is completely implemented in Java (i.e., it is not a legacy system), it can use a dummy implementation of the InputEncoder interface and pass the object to the TheoryLearner class; the dummy implementation is then used only to access the BET input files.
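The interplay of InputEncoder and TheoryLearner can be sketched as below. The interface and class names follow the text, but the method signatures and the minimal stubs are assumptions made for illustration, not BET's actual API:

```java
import java.util.List;

// Converts the four BET XML input files into a legacy system's native
// format; returns the paths of the converted files (signature assumed).
interface InputEncoder {
    List<String> encode(String posFile, String negFile,
                        String bgkFile, String langFile);
}

// Base class holding the encoder; subclasses learn a theory (a list of
// clauses, here represented as strings) from the encoded input.
abstract class TheoryLearner {
    protected final InputEncoder encoder;
    protected TheoryLearner(InputEncoder encoder) { this.encoder = encoder; }
    public abstract List<String> learnTheory(String posFile, String negFile,
                                             String bgkFile, String langFile);
}

// Wrapper for a hypothetical legacy system: encode the inputs, then invoke
// the native binary and parse its output (invocation elided here).
class LegacySystemLearner extends TheoryLearner {
    LegacySystemLearner(InputEncoder encoder) { super(encoder); }
    @Override
    public List<String> learnTheory(String posFile, String negFile,
                                    String bgkFile, String langFile) {
        List<String> converted = encoder.encode(posFile, negFile, bgkFile, langFile);
        // ... spawn the legacy process on `converted` and collect its clauses ...
        return List.of("ancestor(X,Y) :- parent(X,Y).");
    }
}
```

A pure-Java system would instead pass a dummy InputEncoder that simply hands back the BET file paths unchanged.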
EvalMetric (Evaluation Metric): Any ILP algorithm explores the structured space (a subsumption lattice or partial order) of clauses by traversing the graph using refinement operators. At every point in the space, the algorithm has to decide whether a clause in the space is good enough to be part of the hypothesis. For this purpose, the covers relation is employed. If the covers relation is defined by entailment (|=), then the ILP system is said to learn from entailment [9]. Some other ILP systems learn from proof traces [9], while a few others learn from interpretations [9], etc.
Any implementation of EvalMetric specifies the covers relation, i.e., the definition of an example being covered by the background knowledge together with a candidate hypothesis.

TheoremProver: The theorem prover is the backbone of most ILP systems. BET ships with two theorem provers, namely YAP and SWI-Prolog. YAP is perhaps the fastest Prolog engine available today. YAP is built in C and does not have a Java Native Interface (JNI), so the YAP process is run by spawning a separate process from BET. SWI-Prolog is also built in C, but it does support a Java interface: a standard interface called JPL comes bundled with SWI-Prolog itself.

5 Support for Building ILP Systems in BET

It is very easy to integrate an existing system into BET¹, and implementing a new system within BET² is faster than implementing it from scratch. The following subsections describe how to integrate and how to implement a system in BET.

5.1 Integrating a Legacy System in BET

Legacy systems can be integrated into BET as follows:
Implement the interface InputEncoder, which should provide the functionality to convert files from the BET standard format to the format accepted by the legacy system.
Extend the TheoryLearner class and override the method learnTheory.

5.2 Implementing a New ILP System in BET

A dummy implementation of the interface InputEncoder is required for this purpose, since the TheoryLearner class expects an InputEncoder object when it is constructed. Extend the TheoryLearner class and override the method learnTheory. Different evaluation metrics and theorem provers (YAP, SWI-Prolog) can be used in implementing the ILP system.

6 Related Work

Aleph and GILPS (Generic Inductive Logic Programming System) are two earlier efforts to develop a workbench like BET for Inductive Logic Programming. Aleph [8] was developed by Ashwin Srinivasan, whereas the GILPS [14] workbench was developed by Jose Santos.
¹ The authors of this paper took two weeks to integrate PRISM into BET, and one week each for integrating Golem and FOIL.
² The authors of this paper implemented TILDE, FOIL and Golem in BET within a week for each algorithm.
6.1 Aleph

Aleph is written in Prolog, principally for use with Prolog compilers such as YAP and SWI-Prolog. Aleph offers a number of parameters for experimenting with its basic algorithm. These parameters can be used to choose from different search strategies, various evaluation functions, user-defined constraints and cost specifications. Aleph also allows user-defined refinement operators.

6.2 GILPS

GILPS implements three relational systems, namely TopLog, FuncLog and ProGolem. GILPS requires at least YAP 6.0 for its execution. Many of its user predicates and settings are shared with Aleph.

6.3 Advantages of BET over Existing Systems

BET offers the following advantages over existing systems (Aleph, GILPS, etc.):

BET is more likely to find use in the larger applied machine learning and data mining communities, which are less familiar with Prolog and where it is important to interface learning tools such as BET with applications largely written in procedural languages (very often Java). This can also make diagnosis and debugging easier. In contrast, GILPS and Aleph assume familiarity with Prolog and cannot be easily invoked from applications written in Java.

In the spirit of Weka, BET provides a convenient framework for the integration of legacy systems as well as for the implementation of new systems. In fact, while the basic BET framework was provided by the first two co-authors, the remaining co-authors used that framework to implement several ILP algorithms. This is something that Aleph and GILPS do not cater to as well.

BET allows saving learned models as serialized files, which can later be used for finding models for new problems.

BET provides a choice of theorem prover, whereas GILPS can work only with YAP. BET comes with a YAP installer and all the integrated systems.

BET has classes to convert the background knowledge, positive examples and negative examples files into XML format.
Only the language restriction file needs a little human intervention for this purpose. BET provides a standard input/output format which enables the end user to experiment with multiple relational systems on the same datasets; Aleph uses three different files, namely (.f), (.n) and (.b), whereas GILPS takes its input as Prolog files (.pl).
7 Summing up BET

The standardized input format of BET gives users a single input format for all relational systems. The ability of BET to write learned classification/regression models as serialized files allows saving a model and reusing it at a later stage for classification or regression on new problems. The BET graphical user interface makes experimentation with implemented algorithms and integrated systems much easier. The use of Java as the implementation language makes BET extensible and platform independent. The only visible disadvantage of BET is that Java implementations of relational systems are slower than their C/C++/Prolog counterparts, but we chose Java over C++ owing to the former's ease of extension and platform independence.

References

1. E. Frank, M. A. Hall, G. Holmes, R. Kirkby, B. Pfahringer and I. H. Witten: Weka: A machine learning workbench for data mining. In: Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers. Springer, Berlin (2005)
2. H. Blockeel and L. De Raedt: Top-down induction of first-order logical decision trees. Artificial Intelligence Journal (1998). Elsevier Science Publishers Ltd.
3. S.H. Muggleton and C. Feng: Efficient induction of logic programs. In: Proceedings of the First Conference on Algorithmic Learning Theory. Ohmsha, Tokyo (1990)
4. J. Ross Quinlan and R. Mike Cameron-Jones: FOIL: A Midterm Report. In: Machine Learning: ECML-93, European Conference on Machine Learning, Proceedings, vol. 667. Springer-Verlag (1993)
5. Taisuke Sato, Yoshitaka Kameya and Neng-Fa Zhou: Generative Modeling with Failure in PRISM. In: IJCAI (2005)
6. Taisuke Sato and Yoshitaka Kameya: PRISM: A Language for Symbolic-Statistical Modeling. In: IJCAI (1997)
7. Y. Kavurucu, P. Senkul and I.H. Toroslu: ILP-based concept discovery in multi-relational data mining.
In: Expert Systems with Applications, vol. 36 (2009)
8. Aleph Manual, Aleph/aleph.html
9. Statistical Relational Learning Notes, notes/
10. TILDE: Top-down Induction of Logical Decision Trees, ~ilpnet2/systems/tilde.html
11. Progol,
12. YAP Manual,
13. SWI-Prolog,
14. General Inductive Logic Programming System, ~jcs06/gilps/
15. ILP Systems Integrated and Implemented in BET and Sample Examples, http://
Results in Data Mining for Adaptive System Management Arno J. Knobbe, Bart Marseille, Otto Moerbeek, Daniël M.G. van der Wallen Postbus 2729 3800 GG Amersfoort The Netherlands {a.knobbe, b.marseille, o.moerbeek,
Fundamentals of Java Programming
Fundamentals of Java Programming This document is exclusive property of Cisco Systems, Inc. Permission is granted to print and copy this document for non-commercial distribution and exclusive use by instructors
KLMLean 2.0: A Theorem Prover for KLM Logics of Nonmonotonic Reasoning
KLMLean 2.0: A Theorem Prover for KLM Logics of Nonmonotonic Reasoning Laura Giordano 1, Valentina Gliozzi 2, and Gian Luca Pozzato 2 1 Dipartimento di Informatica - Università del Piemonte Orientale A.
Java 6 'th. Concepts INTERNATIONAL STUDENT VERSION. edition
Java 6 'th edition Concepts INTERNATIONAL STUDENT VERSION CONTENTS PREFACE vii SPECIAL FEATURES xxviii chapter i INTRODUCTION 1 1.1 What Is Programming? 2 J.2 The Anatomy of a Computer 3 1.3 Translating
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY
72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD
72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD Paulo Gottgtroy Auckland University of Technology [email protected] Abstract This paper is
A Database Re-engineering Workbench
A Database Re-engineering Workbench A project proposal by Anmol Sharma Abstract Data is not always available in the best form for processing, it is often provided in poor format or in a poor quality data
UML-based Test Generation and Execution
UML-based Test Generation and Execution Jean Hartmann, Marlon Vieira, Herb Foster, Axel Ruder Siemens Corporate Research, Inc. 755 College Road East Princeton NJ 08540, USA [email protected] ABSTRACT
Course Syllabus For Operations Management. Management Information Systems
For Operations Management and Management Information Systems Department School Year First Year First Year First Year Second year Second year Second year Third year Third year Third year Third year Third
Introduction to Fuzzy Control
Introduction to Fuzzy Control Marcelo Godoy Simoes Colorado School of Mines Engineering Division 1610 Illinois Street Golden, Colorado 80401-1887 USA Abstract In the last few years the applications of
Masters in Computing and Information Technology
Masters in Computing and Information Technology Programme Requirements Taught Element, and PG Diploma in Computing and Information Technology: 120 credits: IS5101 CS5001 or CS5002 CS5003 up to 30 credits
A Virtual Machine Searching Method in Networks using a Vector Space Model and Routing Table Tree Architecture
A Virtual Machine Searching Method in Networks using a Vector Space Model and Routing Table Tree Architecture Hyeon seok O, Namgi Kim1, Byoung-Dai Lee dept. of Computer Science. Kyonggi University, Suwon,
Component visualization methods for large legacy software in C/C++
Annales Mathematicae et Informaticae 44 (2015) pp. 23 33 http://ami.ektf.hu Component visualization methods for large legacy software in C/C++ Máté Cserép a, Dániel Krupp b a Eötvös Loránd University [email protected]
Experiences with Online Programming Examinations
Experiences with Online Programming Examinations Monica Farrow and Peter King School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS Abstract An online programming examination
Selective Naive Bayes Regressor with Variable Construction for Predictive Web Analytics
Selective Naive Bayes Regressor with Variable Construction for Predictive Web Analytics Boullé Orange Labs avenue Pierre Marzin 3 Lannion, France [email protected] ABSTRACT We describe our submission
Termination Checking: Comparing Structural Recursion and Sized Types by Examples
Termination Checking: Comparing Structural Recursion and Sized Types by Examples David Thibodeau Decemer 3, 2011 Abstract Termination is an important property for programs and is necessary for formal proofs
White Paper: 5GL RAD Development
White Paper: 5GL RAD Development After 2.5 hours of training, subjects reduced their development time by 60-90% A Study By: 326 Market Street Harrisburg, PA 17101 Luis Paris, Ph.D. Associate Professor
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, [email protected] Abstract: Independent
Application Design: Issues in Expert System Architecture. Harry C. Reinstein Janice S. Aikins
Application Design: Issues in Expert System Architecture Harry C. Reinstein Janice S. Aikins IBM Scientific Center 15 30 Page Mill Road P. 0. Box 10500 Palo Alto, Ca. 94 304 USA ABSTRACT We describe an
Towards a Framework for Generating Tests to Satisfy Complex Code Coverage in Java Pathfinder
Towards a Framework for Generating Tests to Satisfy Complex Code Coverage in Java Pathfinder Matt Department of Computer Science and Engineering University of Minnesota [email protected] Abstract We present
LiDDM: A Data Mining System for Linked Data
LiDDM: A Data Mining System for Linked Data Venkata Narasimha Pavan Kappara Indian Institute of Information Technology Allahabad Allahabad, India [email protected] Ryutaro Ichise National Institute of
Domains and Competencies
Domains and Competencies DOMAIN I TECHNOLOGY APPLICATIONS CORE Standards Assessed: Computer Science 8 12 I VII Competency 001: The computer science teacher knows technology terminology and concepts; the
Technology WHITE PAPER
Technology WHITE PAPER What We Do Neota Logic builds software with which the knowledge of experts can be delivered in an operationally useful form as applications embedded in business systems or consulted
Lappoon R. Tang, Assistant Professor, University of Texas at Brownsville, [email protected]
Volume 2, Issue 1, 2008 Using a Machine Learning Approach for Building Natural Language Interfaces for Databases: Application of Advanced Techniques in Inductive Logic Programming Lappoon R. Tang, Assistant
Simplifying e Business Collaboration by providing a Semantic Mapping Platform
Simplifying e Business Collaboration by providing a Semantic Mapping Platform Abels, Sven 1 ; Sheikhhasan Hamzeh 1 ; Cranner, Paul 2 1 TIE Nederland BV, 1119 PS Amsterdam, Netherlands 2 University of Sunderland,
The Optimality of Naive Bayes
The Optimality of Naive Bayes Harry Zhang Faculty of Computer Science University of New Brunswick Fredericton, New Brunswick, Canada email: hzhang@unbca E3B 5A3 Abstract Naive Bayes is one of the most
COURSE RECOMMENDER SYSTEM IN E-LEARNING
International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,
An Evaluation of Machine Learning Method for Intrusion Detection System Using LOF on Jubatus
An Evaluation of Machine Learning Method for Intrusion Detection System Using LOF on Jubatus Tadashi Ogino* Okinawa National College of Technology, Okinawa, Japan. * Corresponding author. Email: [email protected]
From Big Data to Smart Data
Int'l Conf. Frontiers in Education: CS and CE FECS'15 75 From Big Data to Smart Data Teaching Data Mining and Visualization Antonio Sanchez, Lisa Burnell Ball Department of Computer Science Texas Christian
Scenario: Optimization of Conference Schedule.
MINI PROJECT 1 Scenario: Optimization of Conference Schedule. A conference has n papers accepted. Our job is to organize them in a best possible schedule. The schedule has p parallel sessions at a given
On the Implementation of an ILP System with Prolog
On the Implementation of an ILP System with Prolog Nuno Fonseca, Vitor Santos Costa, Fernando Silva, Rui Camacho Technical Report Series: DCC-2003-03 Departamento de Ciência de Computadores Faculdade de
An Introduction to SAS Enterprise Miner and SAS Forecast Server. André de Waal, Ph.D. Analytical Consultant
SAS Analytics Day An Introduction to SAS Enterprise Miner and SAS Forecast Server André de Waal, Ph.D. Analytical Consultant Agenda 1. Introduction to SAS Enterprise Miner 2. Basics 3. Enterprise Miner
GSiB: PSE Infrastructure for Dynamic Service-oriented Grid Applications
GSiB: PSE Infrastructure for Dynamic Service-oriented Grid Applications Yan Huang Department of Computer Science Cardiff University PO Box 916 Cardiff CF24 3XF United Kingdom [email protected]
Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data
Fifth International Workshop on Computational Intelligence & Applications IEEE SMC Hiroshima Chapter, Hiroshima University, Japan, November 10, 11 & 12, 2009 Extension of Decision Tree Algorithm for Stream
