Naive Rule Induction for Text Classification based on Key-phrases
|
|
- Alberta Fowler
- 7 years ago
- Views:
Transcription
1 Nave Rule Inducton for Text Classfcaton based on Key-phrases Nktas N. Karankolas & Chrstos Skourlas Department of Informatcs, Technologcal Educatonal Insttute of Athens, Greece. Abstract In ths paper, we focus on the nducton of nave rules for classfyng text documents. An algorthm s brefly descrbed for the creaton of key-phrases from a gven set of documents and these key-phrases are organzed and used as features for the automatc classfcaton of documents. An Authorty lst of key-phrases s specfed by the algorthm contanng key-phrases that are frequent wthn the documents of only one or few classes n the tranng set. In ths framework, ths last property permtted us the creaton of nave rules that measure the smlarty of documents wth the exstng classes. Keywords: text data mnng, text classfcaton, nstance based learnng, rule nducton. Introducton Key-phrases or search terms could be defned as sequences of adjacent words wthn a text wndow (e.g. fve successve words of the text / a sentence) formng a meanngful, descrptve phrase related to the content of the text document. Such terms can be used as features for classfyng (text) documents. Snce, not every key-phrase s approprate for dscrmnatng between documents, we have to examne and apply methods for selectng the approprate ones. Hence, a prerequste for such a classfcaton method s the use and mantenance of a lst of key-phrases, the so-called Authorty Lst Karankolas et al [4]. An nterestng problem s related to the reducton of the search space that s needed for the extracton of canddate key-phrases. In Classfcaton learnng, a learnng scheme takes a set of classfed examples from whch t s expected to learn a way of classfyng unseen
2 examples. In Instance-based learnng methods we start wth a partcular nstance (example) and see how t can be generalzed to cover other nearby nstances (examples) n the same class Wtten et al [0]. Therefore, Instance based learnng methods seems to be a novel way for classfyng documents. In such a method a smlarty measure adapted from the doman of Text Retreval (Kowalsk [7], Deal []) can be used to evaluate how close to the exstng classfed documents s a, under classfcaton, document.. Defntons - Notatons Every classfed document could be represented (or accompaned) by a vector of (m+) tems. The frst m tems are assgned to Boolean values, representng the exstence or not of a correspondng key-phrase n the document. It means that the elements of the Authorty lst of key-phrases are m or n other words that the number of key-phrases selected for classfcaton s m. In the last (m+)-tem of the vector s located the class, called the label, of the document. Havng a tranng set of labeled vectors we eventually form a table wth m+ attrbutes (columns). Thus, data mnng algorthms (or knowledge dscovery) can be easly appled to such a representaton of documents, n order to extract rules or trees for classfyng documents Karankolas et al [4]. 2 Materals and Methods In the proposed classfcaton scheme we use an authorty lst of key-phrases and a set of classfed documents and try to apply a smlarty measure whch offers a way of classfyng (unclassfed) documents. Hence, n our work, Text Classfcaton s the task of applyng (sem) automatc methods n order to select the approprate class for a gven document, between a set of predefned ones (classes). More precsely, our nstance based learnng approach for the classfcaton of documents s based on the use of the key-phrases and the related concept of the smlarty between exstng documents (the tranng set) and the (under classfcaton) documents Karankolas et al [5]. The method assumes that smlar documents must be classfed n the same category,.e. must share the same classfcaton code. It remans to select the features of documents that must be used for measurng ther smlarty. In our approach these features are key-phrases that exst n the documents of the tranng set and are extracted from the Authorty Lst. The representaton of our documents s a vector wth (m+2) attrbutes. The frst attrbute contans the document. Each of the next m attrbutes s assgned to a Boolean value that represents the exstence or not of a correspondng key-phrase n the document. The last tem of the vector s the classfcaton code (the label) of document. The followng fgure (fg. ) depcts the vector and ts relatonshp wth the lst of key-phrases and the lst of avalable classes. Ths approach explots a well-known approach n the doman of Informaton Retreval n order to defne a measure of smlarty between the document
3 and the documents of the tranng set. Our proposal for a smlarty measure s an adaptaton of the cosne measure (see for example Lucarella [8]). Authorty Lst of Key-Phrases K K 2 K 3 K 4 K m Doc. ID T)rue F)alse T F F F T CC s CC CC 2 CC 3 CC 4 CC s CC n Classfcaton Codes (CC), T= True, F= False Fgure : The representaton of our documents vectors and ts relatonshp wth the lst of key-phrases and the lst of avalable classes The followng equaton calculates the smlarty: m q k m j j j= j= S( D, D ) = = () LD LD k m m 2 q j j= j= 2 j where m s the number of key-phrases used n the collecton, k j s equal to f the key-phrase j exsts n document D (of the tranng set), otherwse s equal to 0 and q j s the weght of key-phrase j n the document. The followng equaton can be used to measure the term q j : ClassCount q j = log 2 ClassFreq j where ClassCount s the number of classes of the tranng set, and ClassFreq j s the number of classes that nclude the key-phrase j. Selectng key-phrases (Georgantopoulos et al [3], Turney [9]) that wll be used as attrbutes for text classfcaton s a very mportant step n our approach. The choce of the key-phrases has not to be based on frequent (wthn the whole text collecton) canddate phrases. We must also avod choosng key-phrases that exst n a few texts of the collecton (as a whole) but are qute frequent wthn these texts Karankolas et al [6]. The creaton of an Authorty Lst and ts use for document classfcaton must be based on the selecton of key-phrases that are frequent wthn the documents of only one or few classes n the tranng set. q j k j
4 2. Phrase Extracton We formalze the problem of phrase extracton for classfcaton n the followng way. Gven a collecton of documents subdvded nto classes, a wndow wdth and a frequency threshold, fnd all key-phrases that occur frequently enough n only one or few classes. The followng algorthm can solve the problem of extractng the approprate key-phrases and has two alternatng phases: buldng canddate key-phrases, and evaluatng how often these ones occur n a class of the collecton. Algorthm For every class (CL ) of the tranng set do 2 For every document of the class (DCL ) do 3 Stemmng 4 stop word removal 5 End {For every document of the class} 6 Choose the most frequent stems of the class (P0 - parameter) 7 Form the canddate double word phrases (C 2 ) from the frequent stems (L ) 8 Choose the most frequent double word phrases (L 2 ) (W and P - parameters) 9 For x=3,4 do 0 Form the canddate x - wdth word phrases (C x ) from the frequent (x-) wdth word phrases (L x- ) Choose the most frequent x wdth word phrases (L x ) (P2 and W2, P3 and W3 parameters) 2 End {For x=3,4 do} 3 Compose an ntegrated lst by jonng L x (for x=2,3,4). Ths jon, forms the frequent word phrases of class (FCL ) 4 End {For every class of the tranng set} 5 Integrate / Jon the lsts of frequent word phrases of all classes of the tranng set 6 Reject the frequent word phrases that exst n many classes (Pt - parameter). The rest of the frequent word phrases form the set of key-phrases or «Authorty lst» 7 Form the dctonary of «Terms». It s the lst of stems that are components of the key-phrases of the «Authorty lst». Where Parameter P0 s the percentage of texts of the class that must contan a stem, W s the wdth of wndow that covers 2-word phrases, P s the percentage of texts of the class that must contan a 2-words phrase, etc. Pt s the percentage of classes that can contan a key-phrase. The way that step0 forms the canddate x-wdth word phrases (C x ) from the frequent (x-) wdth word phrases (L x- ) s an nterestng part of our algorthm. The effcency of the algorthm s based on the followng estmaton: Potentally, a very large number of canddate key-phrases has to be checked. Hence, we can reduce the search space by buldng larger key-phrases from smaller ones. In
5 other words, t s only necessary to test the occurrences of key-phrases whose all sub-(key-phrases) are frequent. 3 Nave Rules nducton CL : { }, 2 In ths secton we focus on some key ponts for the nducton of nave rules that measure the smlarty of a, unclassfed, document wth the pre-defned classes of documents. There s an nterestng property ncorporated nto the above Key-phrase Extracton algorthm: The Authorty lst created by the algorthm contans keyphrases that are frequent wthn the documents of only one or few classes. Consequently, for each class we can have a lst of representatve key-phrases. A framework for extractng nave rules can be the followng one: The smlarty of a document wth the elements of some class CL s estmated by the number of key-phrases from the lst of the representatve ones for the class CL. More precsely, the calculaton of smlarty s equal to the number of tems of CL class representatve key-phrases lst that actually exst n the document, dvded (normalzed) by the populaton of the lst. The followng notaton denotes the lst of representatve key-phrases of class RCL = kpcl kpcl,..., kpcl, where kpcl j s the j-(keyphrase) of the representatve key-phrases of class CL. The measure RCL defnes the number of the representatve key-phrases for the specfed class CL. A smple framework for extractng nave rules that measure the smlarty of a, unclassfed, document wth the class CL s defned by the equaton: ' S ( D, CL ) count( exst( kpcl, D RCL j r = (2a) where the operaton (functon) exsts denotes f a representatve key-phrase kpcl j exsts n the document and counts means the calculaton of the exstng key-phrases. If we want to proceed nductng more complcated rules we can use weghts based on the number of words (stems) that consttute a key-phrase. ' S ( D, CL ) = RCL j= count( conttuent( exst( kpcl, D RCL j= )) count( consttuent( kpcl )) (2b) Thus, gven a document we can measure ts smlarty wth every class, usng formula (2a) or (2b), and suggest the most plausble classes,.e. the classes that have the greater smlarty wth the document. In other words, we have to apply the smlarty measures S ' ( D, CL ) for any avalable class and choose the class that gets the hghest value. On the contrary, an nstance based learnng approach needs the applcaton of the formula for the calculaton j j ))) CL
6 of the smlarty S D, D ) as many tmes as the sze of the tranng set. ( Moreover, n order to select the most promsng class, we have to apply a second step that fnds the average smlarty of the document wth the documents of each class of the tranng set. Ths can be obtaned wth the followng formula: '' S ( D, CL ) = D j DCL S( D, D j DCL (3) where DCL s the subset of the tranng s set documents that are pre-classfed as members of class CL. Example Let say that we have a tranng set that contans a thousand (000) documents pre-classfed n eght (8) classes. In order to select the most relevant class for a document, wth the usage of the nstance based learnng approach, we have to: apply formula (), between every document of the tranng set and the unclassfed document,.e. a thousand of tmes apply formula (3), to get the average smlarty between each class and the unclassfed document,.e. eght tmes On the contrary, the applcaton of nave rule (2a), eght (8) tmes s the only step needed to select the most relevant class for the document. Thus, the nave rule nducton based approach s smple and could be easly appled n varous applcatons. On the other hand, the nstance based learnng approach can be appled only when the tranng set s not a very extended one. 4 Conclusons We have presented methods for (sem) automatc classfcaton of documents based on nstance based learnng and nave rule nducton. Our approach s based on the use of key-phrases from a controlled lst (Authorty Lst). The creaton and mantenance of the key-phrase s (authorty) lst s an mportant subject n the doman of document classfcaton. Our algorthm for key-phrase selecton permts the nducton of nave rules for classfcaton. There s an nterestng property n our algorthm, for selectng the approprate keyphrases for classfcaton that permtted us to nvestgate a nave rule nducton for classfcaton, as an alternatve to the classcal data mnng algorthms. The Authorty lst s created by the proposed algorthm and contans key-phrases that are frequent wthn the documents of only one or few classes n the tranng set. Consequently, for each class we can have a lst of representatve key-phrases. A framework for nave rules nducton s based on a measure of the smlarty between a document and the representatve key-phrases of some class CL. Hence, f a document s gven we can measure ts smlarty wth every class and suggest the most plausble classes,.e. the classes that have the )
7 greater smlarty wth the document. It s also possble to extract more complcated rules usng weghts e.g. weghts based on the number of words (stems) that consttute a key-phrase. Acknowledgements Ths work was co-funded by 75% from E.E. and 25% from the Greek Government under the framework of the Educaton and Intal Vocatonal Tranng Program Archmedes. References [] Deal D.C., Technques of Document Management: A revew of Text Retreval and related technologes. Journal of Documentaton, 57, pp , 200. [2] Frank Ebe, et al. Doman-Specfc Key-phrase Extracton. Internatonal Jont Conference of Artfcal Intellgence, 999. [3] Georgantopoulos B. & Pperds S. Automatc Term Extracton Based on Pattern Grammars. LogoNavgaton, Issue 5, May 999. [4] Karankolas N.N. & Skourlas C. Automatc Dagnoss Classfcaton of patent ds charge letters. MIE 2002: XVIIth Internatonal Congress of the European Federaton for Medcal Informatcs, Budapest, August, IOS Press, ISBN: , ISSN: [5] Karankolas N.N., Skourlas C., Chrstopoulou A. & Alevzos T. Medcal Text Classfcaton based on Text Retreval technques. MEDINF st Internatonal Conference on Medcal Informatcs & Engneerng, October 9 -, 2003, Craova, Romana. [6] Karankolas N.N. & Skourlas C. Key-Phrase Extracton for Classfcaton. MEDICON and HEALTH TELEMATICS X Medterranean Conference on Medcal and Bologcal Engneerng, July 3 - August 5, 2004, Ischa, Italy. [7] Kowalsk G. Informaton Retreval Systems. Theory and Implementaton, Kluwer Academc Publshers, st edn, 997, ISBN [8] Lucarella, D., A document retreval system based on nearest neghbour searchng, Journal of Informaton Scence, 4, pp , 988. [9] Turney P. Extracton of Key-phrases from Text: Evaluaton of Four Algorthms. Natonal Research Councl of Canada, Techncal Report ERB- 05, October 23, 997. [0] Wtten Ian & Frank Ebe, Data Mnng: Practcal Machne Learnng tools and Technques wth Java mplementaton, Morgan Kaufmann, 999, ISBN:
Calculation of Sampling Weights
Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a two-stage stratfed cluster desgn. 1 The frst stage conssted of a sample
More informationFREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES
FREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES Zuzanna BRO EK-MUCHA, Grzegorz ZADORA, 2 Insttute of Forensc Research, Cracow, Poland 2 Faculty of Chemstry, Jagellonan
More informationWhat is Candidate Sampling
What s Canddate Samplng Say we have a multclass or mult label problem where each tranng example ( x, T ) conssts of a context x a small (mult)set of target classes T out of a large unverse L of possble
More information8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by
6 CHAPTER 8 COMPLEX VECTOR SPACES 5. Fnd the kernel of the lnear transformaton gven n Exercse 5. In Exercses 55 and 56, fnd the mage of v, for the ndcated composton, where and are gven by the followng
More informationForecasting the Direction and Strength of Stock Market Movement
Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract - Stock market s one of the most complcated systems
More informationProject Networks With Mixed-Time Constraints
Project Networs Wth Mxed-Tme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa
More informationConversion between the vector and raster data structures using Fuzzy Geographical Entities
Converson between the vector and raster data structures usng Fuzzy Geographcal Enttes Cdála Fonte Department of Mathematcs Faculty of Scences and Technology Unversty of Combra, Apartado 38, 3 454 Combra,
More informationThe Greedy Method. Introduction. 0/1 Knapsack Problem
The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton
More informationLuby s Alg. for Maximal Independent Sets using Pairwise Independence
Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent
More informationUsing Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council
Usng Supervsed Clusterng Technque to Classfy Receved Messages n 137 Call Center of Tehran Cty Councl Mahdyeh Haghr 1*, Hamd Hassanpour 2 (1) Informaton Technology engneerng/e-commerce, Shraz Unversty (2)
More informationThe Development of Web Log Mining Based on Improve-K-Means Clustering Analysis
The Development of Web Log Mnng Based on Improve-K-Means Clusterng Analyss TngZhong Wang * College of Informaton Technology, Luoyang Normal Unversty, Luoyang, 471022, Chna wangtngzhong2@sna.cn Abstract.
More informationLecture 2: Single Layer Perceptrons Kevin Swingler
Lecture 2: Sngle Layer Perceptrons Kevn Sngler kms@cs.str.ac.uk Recap: McCulloch-Ptts Neuron Ths vastly smplfed model of real neurons s also knon as a Threshold Logc Unt: W 2 A Y 3 n W n. A set of synapses
More informationSearching for Interacting Features for Spam Filtering
Searchng for Interactng Features for Spam Flterng Chuanlang Chen 1, Yun-Chao Gong 2, Rongfang Be 1,, and X. Z. Gao 3 1 Department of Computer Scence, Bejng Normal Unversty, Bejng 100875, Chna 2 Software
More informationDesign of Output Codes for Fast Covering Learning using Basic Decomposition Techniques
Journal of Computer Scence (7): 565-57, 6 ISSN 59-66 6 Scence Publcatons Desgn of Output Codes for Fast Coverng Learnng usng Basc Decomposton Technques Aruna Twar and Narendra S. Chaudhar, Faculty of Computer
More informationForecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network
700 Proceedngs of the 8th Internatonal Conference on Innovaton & Management Forecastng the Demand of Emergency Supples: Based on the CBR Theory and BP Neural Network Fu Deqang, Lu Yun, L Changbng School
More information1 Example 1: Axis-aligned rectangles
COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 6 Scrbe: Aaron Schld February 21, 2013 Last class, we dscussed an analogue for Occam s Razor for nfnte hypothess spaces that, n conjuncton
More informationSelecting Best Employee of the Year Using Analytical Hierarchy Process
J. Basc. Appl. Sc. Res., 5(11)72-76, 2015 2015, TextRoad Publcaton ISSN 2090-4304 Journal of Basc and Appled Scentfc Research www.textroad.com Selectng Best Employee of the Year Usng Analytcal Herarchy
More informationHow To Understand The Results Of The German Meris Cloud And Water Vapour Product
Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller
More informationModule 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..
More information1. Measuring association using correlation and regression
How to measure assocaton I: Correlaton. 1. Measurng assocaton usng correlaton and regresson We often would lke to know how one varable, such as a mother's weght, s related to another varable, such as a
More informationThe OC Curve of Attribute Acceptance Plans
The OC Curve of Attrbute Acceptance Plans The Operatng Characterstc (OC) curve descrbes the probablty of acceptng a lot as a functon of the lot s qualty. Fgure 1 shows a typcal OC Curve. 10 8 6 4 1 3 4
More informationVision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION
Vson Mouse Saurabh Sarkar a* a Unversty of Cncnnat, Cncnnat, USA ABSTRACT The report dscusses a vson based approach towards trackng of eyes and fngers. The report descrbes the process of locatng the possble
More informationMachine Learning and Software Quality Prediction: As an Expert System
I.J. Informaton Engneerng and Electronc Busness, 2014, 2, 9-27 Publshed Onlne Aprl 2014 n MECS (http://www.mecs-press.org/) DOI: 10.5815/jeeb.2014.02.02 Machne Learnng and Software Qualty Predcton: As
More information1.1 The University may award Higher Doctorate degrees as specified from time-to-time in UPR AS11 1.
HIGHER DOCTORATE DEGREES SUMMARY OF PRINCIPAL CHANGES General changes None Secton 3.2 Refer to text (Amendments to verson 03.0, UPR AS02 are shown n talcs.) 1 INTRODUCTION 1.1 The Unversty may award Hgher
More informationImplementations of Web-based Recommender Systems Using Hybrid Methods
Internatonal Journal of Computer Scence & Applcatons Vol. 3 Issue 3, pp 52-64 2006 Technomathematcs Research Foundaton Implementatons of Web-based Recommender Systems Usng Hybrd Methods Janusz Sobeck Insttute
More informationPEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS
PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS Yunhong Xu, Faculty of Management and Economcs, Kunmng Unversty of Scence and Technology,
More informationDescriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications
CMSC828G Prncples of Data Mnng Lecture #9 Today s Readng: HMS, chapter 9 Today s Lecture: Descrptve Modelng Clusterng Algorthms Descrptve Models model presents the man features of the data, a global summary
More informationA neuro-fuzzy collaborative filtering approach for Web recommendation. G. Castellano, A. M. Fanelli, and M. A. Torsello *
Internatonal Journal of Computatonal Scence 992-6669 (Prnt) 992-6677 (Onlne) Global Informaton Publsher 27, Vol., No., 27-39 A neuro-fuzzy collaboratve flterng approach for Web recommendaton G. Castellano,
More informationDocument Clustering Analysis Based on Hybrid PSO+K-means Algorithm
Document Clusterng Analyss Based on Hybrd PSO+K-means Algorthm Xaohu Cu, Thomas E. Potok Appled Software Engneerng Research Group, Computatonal Scences and Engneerng Dvson, Oak Rdge Natonal Laboratory,
More informationNEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION
NEURO-FUZZY INFERENE SYSTEM FOR E-OMMERE WEBSITE EVALUATION Huan Lu, School of Software, Harbn Unversty of Scence and Technology, Harbn, hna Faculty of Appled Mathematcs and omputer Scence, Belarusan State
More informationA Simple Approach to Clustering in Excel
A Smple Approach to Clusterng n Excel Aravnd H Center for Computatonal Engneerng and Networng Amrta Vshwa Vdyapeetham, Combatore, Inda C Rajgopal Center for Computatonal Engneerng and Networng Amrta Vshwa
More informationAn Interest-Oriented Network Evolution Mechanism for Online Communities
An Interest-Orented Network Evoluton Mechansm for Onlne Communtes Cahong Sun and Xaopng Yang School of Informaton, Renmn Unversty of Chna, Bejng 100872, P.R. Chna {chsun,yang}@ruc.edu.cn Abstract. Onlne
More informationPlanning for Marketing Campaigns
Plannng for Marketng Campagns Qang Yang and Hong Cheng Department of Computer Scence Hong Kong Unversty of Scence and Technology Clearwater Bay, Kowloon, Hong Kong, Chna (qyang, csch)@cs.ust.hk Abstract
More informationFeature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College
Feature selecton for ntruson detecton Slobodan Petrovć NISlab, Gjøvk Unversty College Contents The feature selecton problem Intruson detecton Traffc features relevant for IDS The CFS measure The mrmr measure
More informationAn Inductive Fuzzy Classification Approach applied to Individual Marketing
An Inductve Fuzzy Classfcaton Approach appled to Indvdual Marketng Mchael Kaufmann, Andreas Meer Abstract A data mnng methodology for an nductve fuzzy classfcaton s ntroduced. The nducton step s based
More informationv a 1 b 1 i, a 2 b 2 i,..., a n b n i.
SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 455 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces we have studed thus far n the text are real vector spaces snce the scalars are
More informationCan Auto Liability Insurance Purchases Signal Risk Attitude?
Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang
More informationMining Multiple Large Data Sources
The Internatonal Arab Journal of Informaton Technology, Vol. 7, No. 3, July 2 24 Mnng Multple Large Data Sources Anmesh Adhkar, Pralhad Ramachandrarao 2, Bhanu Prasad 3, and Jhml Adhkar 4 Department of
More informationWISE-Integrator: An Automatic Integrator of Web Search Interfaces for E-Commerce
WSE-ntegrator: An Automatc ntegrator of Web Search nterfaces for E-Commerce Ha He, Wey Meng Dept. of Computer Scence SUNY at Bnghamton Bnghamton, NY 13902 {hahe,meng}@cs.bnghamton.edu Clement Yu Dept.
More informationResearch on Transformation Engineering BOM into Manufacturing BOM Based on BOP
Appled Mechancs and Materals Vols 10-12 (2008) pp 99-103 Onlne avalable snce 2007/Dec/06 at wwwscentfcnet (2008) Trans Tech Publcatons, Swtzerland do:104028/wwwscentfcnet/amm10-1299 Research on Transformaton
More informationRecurrence. 1 Definitions and main statements
Recurrence 1 Defntons and man statements Let X n, n = 0, 1, 2,... be a MC wth the state space S = (1, 2,...), transton probabltes p j = P {X n+1 = j X n = }, and the transton matrx P = (p j ),j S def.
More informationA Secure Password-Authenticated Key Agreement Using Smart Cards
A Secure Password-Authentcated Key Agreement Usng Smart Cards Ka Chan 1, Wen-Chung Kuo 2 and Jn-Chou Cheng 3 1 Department of Computer and Informaton Scence, R.O.C. Mltary Academy, Kaohsung 83059, Tawan,
More informationWe are now ready to answer the question: What are the possible cardinalities for finite fields?
Chapter 3 Fnte felds We have seen, n the prevous chapters, some examples of fnte felds. For example, the resdue class rng Z/pZ (when p s a prme) forms a feld wth p elements whch may be dentfed wth the
More informationA Performance Analysis of View Maintenance Techniques for Data Warehouses
A Performance Analyss of Vew Mantenance Technques for Data Warehouses Xng Wang Dell Computer Corporaton Round Roc, Texas Le Gruenwald The nversty of Olahoma School of Computer Scence orman, OK 739 Guangtao
More informationSingle and multiple stage classifiers implementing logistic discrimination
Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,
More informationDesign and Development of a Security Evaluation Platform Based on International Standards
Internatonal Journal of Informatcs Socety, VOL.5, NO.2 (203) 7-80 7 Desgn and Development of a Securty Evaluaton Platform Based on Internatonal Standards Yuj Takahash and Yoshm Teshgawara Graduate School
More informationDynamic Fuzzy Pattern Recognition
Dynamc Fuzzy Pattern Recognton Von der Fakultät für Wrtschaftswssenschaften der Rhensch-Westfälschen Technschen Hochschule Aachen zur Erlangung des akademschen Grades enes Doktors der Wrtschafts- und Sozalwssenschaften
More informationApproaches to Text Mining for Clinical Medical Records
Approaches to Text Mnng for Clncal Medcal Records Xaohua Zhou and Hyol Han College of Informaton Scence and Technology Drexel Unversty Phladelpha, PA 19104 xaohua.zhou@drexel.edu hhan@cs.drexel.edu Isaac
More informationProduct Quality and Safety Incident Information Tracking Based on Web
Product Qualty and Safety Incdent Informaton Trackng Based on Web News 1 Yuexang Yang, 2 Correspondng Author Yyang Wang, 2 Shan Yu, 2 Jng Q, 1 Hual Ca 1 Chna Natonal Insttute of Standardzaton, Beng 100088,
More informationA hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm
Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):1884-1889 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel
More informationFast Fuzzy Clustering of Web Page Collections
Fast Fuzzy Clusterng of Web Page Collectons Chrstan Borgelt and Andreas Nürnberger Dept. of Knowledge Processng and Language Engneerng Otto-von-Guercke-Unversty of Magdeburg Unverstätsplatz, D-396 Magdeburg,
More informationA machine vision approach for detecting and inspecting circular parts
A machne vson approach for detectng and nspectng crcular parts Du-Mng Tsa Machne Vson Lab. Department of Industral Engneerng and Management Yuan-Ze Unversty, Chung-L, Tawan, R.O.C. E-mal: edmtsa@saturn.yzu.edu.tw
More informationAn Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems
STAN-CS-73-355 I SU-SE-73-013 An Analyss of Central Processor Schedulng n Multprogrammed Computer Systems (Dgest Edton) by Thomas G. Prce October 1972 Techncal Report No. 57 Reproducton n whole or n part
More informationDEFINING %COMPLETE IN MICROSOFT PROJECT
CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,
More informationPSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12
14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed
More information320 The Internatonal Arab Journal of Informaton Technology, Vol. 5, No. 3, July 2008 Comparsons Between Data Clusterng Algorthms Osama Abu Abbas Computer Scence Department, Yarmouk Unversty, Jordan Abstract:
More informationHow To Know The Components Of Mean Squared Error Of Herarchcal Estmator S
S C H E D A E I N F O R M A T I C A E VOLUME 0 0 On Mean Squared Error of Herarchcal Estmator Stans law Brodowsk Faculty of Physcs, Astronomy, and Appled Computer Scence, Jagellonan Unversty, Reymonta
More informationCHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol
CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK Sample Stablty Protocol Background The Cholesterol Reference Method Laboratory Network (CRMLN) developed certfcaton protocols for total cholesterol, HDL
More informationCOMPUTER SUPPORT OF SEMANTIC TEXT ANALYSIS OF A TECHNICAL SPECIFICATION ON DESIGNING SOFTWARE. Alla Zaboleeva-Zotova, Yulia Orlova
Internatonal Book Seres "Informaton Scence and Computng" 29 COMPUTE SUPPOT O SEMANTIC TEXT ANALYSIS O A TECHNICAL SPECIICATION ON DESIGNING SOTWAE Alla Zaboleeva-Zotova, Yula Orlova Abstract: The gven
More informationComparison of Domain-Specific Lexicon Construction Methods for Sentiment Analysis
, pp.152-156 http://d.do.org/10.14257/astl.2016.135.38 Comparson of Doman-Specfc Lecon Constructon Methods for Sentment Analyss Myeong So Km 1, Jong Woo Km 2,3 and Cu Jng 4 1 Department of Mathematcs,
More informationTuition Fee Loan application notes
Tuton Fee Loan applcaton notes for new part-tme EU students 2012/13 About these notes These notes should be read along wth your Tuton Fee Loan applcaton form. The notes are splt nto three parts: Part 1
More informationSurvey on Virtual Machine Placement Techniques in Cloud Computing Environment
Survey on Vrtual Machne Placement Technques n Cloud Computng Envronment Rajeev Kumar Gupta and R. K. Paterya Department of Computer Scence & Engneerng, MANIT, Bhopal, Inda ABSTRACT In tradtonal data center
More informationUsing an Adaptive Fuzzy Logic System to Optimise Knowledge Discovery in Proteomics
Usng an Adaptve Fuzzy Logc System to Optmse Knowledge Dscovery n Proteomcs James Malone, Ken McGarry and Chrs Bowerman School of Computng and Technology Sunderland Unversty St. Peter s Way, Sunderland,
More informationRank Based Clustering For Document Retrieval From Biomedical Databases
Jayanth Mancassamy et al /Internatonal Journal on Computer Scence and Engneerng Vol.1(2), 2009, 111-115 Rank Based Clusterng For Document Retreval From Bomedcal Databases Jayanth Mancassamy Department
More informationSemantic Link Analysis for Finding Answer Experts *
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 28, 51-65 (2012) Semantc Lnk Analyss for Fndng Answer Experts * YAO LU 1,2,3, XIAOJUN QUAN 2, JINGSHENG LEI 4, XINGLIANG NI 1,2,3, WENYIN LIU 2,3 AND YINLONG
More informationBERNSTEIN POLYNOMIALS
On-Lne Geometrc Modelng Notes BERNSTEIN POLYNOMIALS Kenneth I. Joy Vsualzaton and Graphcs Research Group Department of Computer Scence Unversty of Calforna, Davs Overvew Polynomals are ncredbly useful
More informationA DATA MINING APPLICATION IN A STUDENT DATABASE
JOURNAL OF AERONAUTICS AND SPACE TECHNOLOGIES JULY 005 VOLUME NUMBER (53-57) A DATA MINING APPLICATION IN A STUDENT DATABASE Şenol Zafer ERDOĞAN Maltepe Ünversty Faculty of Engneerng Büyükbakkalköy-Istanbul
More informationAn Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement
An Enhanced Super-Resoluton System wth Improved Image Regstraton, Automatc Image Selecton, and Image Enhancement Yu-Chuan Kuo ( ), Chen-Yu Chen ( ), and Chou-Shann Fuh ( ) Department of Computer Scence
More informationIT09 - Identity Management Policy
IT09 - Identty Management Polcy Introducton 1 The Unersty needs to manage dentty accounts for all users of the Unersty s electronc systems and ensure that users hae an approprate leel of access to these
More informationAn Efficient Recovery Algorithm for Coverage Hole in WSNs
An Effcent Recover Algorthm for Coverage Hole n WSNs Song Ja 1,*, Wang Balng 1, Peng Xuan 1 School of Informaton an Electrcal Engneerng Harbn Insttute of Technolog at Weha, Shanong, Chna Automatc Test
More informationAn Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services
An Evaluaton of the Extended Logstc, Smple Logstc, and Gompertz Models for Forecastng Short Lfecycle Products and Servces Charles V. Trappey a,1, Hsn-yng Wu b a Professor (Management Scence), Natonal Chao
More informationImproved Mining of Software Complexity Data on Evolutionary Filtered Training Sets
Improved Mnng of Software Complexty Data on Evolutonary Fltered Tranng Sets VILI PODGORELEC Insttute of Informatcs, FERI Unversty of Marbor Smetanova ulca 17, SI-2000 Marbor SLOVENIA vl.podgorelec@un-mb.s
More informationTHE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION
Internatonal Journal of Electronc Busness Management, Vol. 3, No. 4, pp. 30-30 (2005) 30 THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Yu-Mn Chang *, Yu-Cheh
More informationSet. algorithms based. 1. Introduction. System Diagram. based. Exploration. 2. Index
ISSN (Prnt): 1694-0784 ISSN (Onlne): 1694-0814 www.ijcsi.org 236 IT outsourcng servce provder dynamc evaluaton model and algorthms based on Rough Set L Sh Sh 1,2 1 Internatonal School of Software, Wuhan
More informationWeb Spam Detection Using Machine Learning in Specific Domain Features
Journal of Informaton Assurance and Securty 3 (2008) 220-229 Web Spam Detecton Usng Machne Learnng n Specfc Doman Features Hassan Najadat 1, Ismal Hmed 2 Department of Computer Informaton Systems Faculty
More informationMyINS: A CBR e-commerce Application for Insurance Policies
Proceedngs of the 5th WSEAS Internatonal Conference on E-ACTIVITIES, Vence, Italy, November 20-22, 2006 373 MyINS: A CBR e-commerce Applcaton for Insurance Polces SITI SORAYA ABDUL RAHMAN, AZAH ANIR NORMAN
More informationMining Feature Importance: Applying Evolutionary Algorithms within a Web-based Educational System
Mnng Feature Importance: Applyng Evolutonary Algorthms wthn a Web-based Educatonal System Behrouz MINAEI-BIDGOLI 1, and Gerd KORTEMEYER 2, and Wllam F. PUNCH 1 1 Genetc Algorthms Research and Applcatons
More informationEstimating the Development Effort of Web Projects in Chile
Estmatng the Development Effort of Web Projects n Chle Sergo F. Ochoa Computer Scences Department Unversty of Chle (56 2) 678-4364 sochoa@dcc.uchle.cl M. Cecla Bastarrca Computer Scences Department Unversty
More informationDemographic and Health Surveys Methodology
samplng and household lstng manual Demographc and Health Surveys Methodology Ths document s part of the Demographc and Health Survey s DHS Toolkt of methodology for the MEASURE DHS Phase III project, mplemented
More informationA Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression
Novel Methodology of Workng Captal Management for Large Publc Constructons by Usng Fuzzy S-curve Regresson Cheng-Wu Chen, Morrs H. L. Wang and Tng-Ya Hseh Department of Cvl Engneerng, Natonal Central Unversty,
More informationCausal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting
Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of
More informationStatistical Approach for Offline Handwritten Signature Verification
Journal of Computer Scence 4 (3): 181-185, 2008 ISSN 1549-3636 2008 Scence Publcatons Statstcal Approach for Offlne Handwrtten Sgnature Verfcaton 2 Debnath Bhattacharyya, 1 Samr Kumar Bandyopadhyay, 2
More informationBiometric Signature Processing & Recognition Using Radial Basis Function Network
Bometrc Sgnature Processng & Recognton Usng Radal Bass Functon Network Ankt Chadha, Neha Satam, and Vbha Wal Abstract- Automatc recognton of sgnature s a challengng problem whch has receved much attenton
More informationUsing Content-Based Filtering for Recommendation 1
Usng Content-Based Flterng for Recommendaton 1 Robn van Meteren 1 and Maarten van Someren 2 1 NetlnQ Group, Gerard Brandtstraat 26-28, 1054 JK, Amsterdam, The Netherlands, robn@netlnq.nl 2 Unversty of
More informationAbstract. 260 Business Intelligence Journal July IDENTIFICATION OF DEMAND THROUGH STATISTICAL DISTRIBUTION MODELING FOR IMPROVED DEMAND FORECASTING
260 Busness Intellgence Journal July IDENTIFICATION OF DEMAND THROUGH STATISTICAL DISTRIBUTION MODELING FOR IMPROVED DEMAND FORECASTING Murphy Choy Mchelle L.F. Cheong School of Informaton Systems, Sngapore
More informationDevelopment of an intelligent system for tool wear monitoring applying neural networks
of Achevements n Materals and Manufacturng Engneerng VOLUME 14 ISSUE 1-2 January-February 2006 Development of an ntellgent system for tool wear montorng applyng neural networks A. Antć a, J. Hodolč a,
More informationCluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms
Internatonal Journal of Appled Informaton Systems (IJAIS) ISSN : 2249-0868 Foundaton of Computer Scence FCS, New York, USA Volume 7 No.7, August 2014 www.jas.org Cluster Analyss of Data Ponts usng Parttonng
More informationGraph Calculus: Scalable Shortest Path Analytics for Large Social Graphs through Core Net
Graph Calculus: Scalable Shortest Path Analytcs for Large Socal Graphs through Core Net Lxn Fu Department of Computer Scence Unversty of North Carolna at Greensboro Greensboro, NC, U.S.A. lfu@uncg.edu
More informationFace Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching)
Face Recognton Problem Face Verfcaton Problem Face Verfcaton (1:1 matchng) Querymage face query Face Recognton (1:N matchng) database Applcaton: Access Control www.vsage.com www.vsoncs.com Bometrc Authentcaton
More informationBayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending
Proceedngs of 2012 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 25 (2012) (2012) IACSIT Press, Sngapore Bayesan Network Based Causal Relatonshp Identfcaton and Fundng Success
More informationImplementation of Deutsch's Algorithm Using Mathcad
Implementaton of Deutsch's Algorthm Usng Mathcad Frank Roux The followng s a Mathcad mplementaton of Davd Deutsch's quantum computer prototype as presented on pages - n "Machnes, Logc and Quantum Physcs"
More informationWeb Object Indexing Using Domain Knowledge *
Web Object Indexng Usng Doman Knowledge * Muyuan Wang Department of Automaton Tsnghua Unversty Bejng 100084, Chna (86-10)51774518 Zhwe L, Le Lu, We-Yng Ma Mcrosoft Research Asa Sgma Center, Hadan Dstrct
More informationVehicle Detection and Tracking in Video from Moving Airborne Platform
Journal of Computatonal Informaton Systems 10: 12 (2014) 4965 4972 Avalable at http://www.jofcs.com Vehcle Detecton and Trackng n Vdeo from Movng Arborne Platform Lye ZHANG 1,2,, Hua WANG 3, L LI 2 1 School
More informationSample Design in TIMSS and PIRLS
Sample Desgn n TIMSS and PIRLS Introducton Marc Joncas Perre Foy TIMSS and PIRLS are desgned to provde vald and relable measurement of trends n student achevement n countres around the world, whle keepng
More informationGender Classification for Real-Time Audience Analysis System
Gender Classfcaton for Real-Tme Audence Analyss System Vladmr Khryashchev, Lev Shmaglt, Andrey Shemyakov, Anton Lebedev Yaroslavl State Unversty Yaroslavl, Russa vhr@yandex.ru, shmaglt_lev@yahoo.com, andrey.shemakov@gmal.com,
More informationStudy on Model of Risks Assessment of Standard Operation in Rural Power Network
Study on Model of Rsks Assessment of Standard Operaton n Rural Power Network Qngj L 1, Tao Yang 2 1 Qngj L, College of Informaton and Electrcal Engneerng, Shenyang Agrculture Unversty, Shenyang 110866,
More informationLearning from Large Distributed Data: A Scaling Down Sampling Scheme for Efficient Data Processing
Internatonal Journal of Machne Learnng and Computng, Vol. 4, No. 3, June 04 Learnng from Large Dstrbuted Data: A Scalng Down Samplng Scheme for Effcent Data Processng Che Ngufor and Janusz Wojtusak part
More informationHowHow to Find the Best Online Stock Broker
A GENERAL APPROACH FOR SECURITY MONITORING AND PREVENTIVE CONTROL OF NETWORKS WITH LARGE WIND POWER PRODUCTION Helena Vasconcelos INESC Porto hvasconcelos@nescportopt J N Fdalgo INESC Porto and FEUP jfdalgo@nescportopt
More informationTraffic-light a stress test for life insurance provisions
MEMORANDUM Date 006-09-7 Authors Bengt von Bahr, Göran Ronge Traffc-lght a stress test for lfe nsurance provsons Fnansnspetonen P.O. Box 6750 SE-113 85 Stocholm [Sveavägen 167] Tel +46 8 787 80 00 Fax
More information