4. GPCRs PREDICTION USING GREY INCIDENCE DEGREE MEASURE AND PRINCIPAL COMPONENT ANALYSIS


 Rodger Gardner
The GPCR sequences are made up of amino acid polypeptide chains, which can also be called subunits. The number and arrangement of the subunits forming a GPCR sequence is called its quaternary structure. GPCRs exhibit different types of quaternary structure, such as monomer, dimer, trimer, tetramer and pentamer. Some biological processes are directly affected by quaternary structure. For example, monomers form sodium channels (Chen, Alcayaga, Suarez-Isla, O'Rourke, Tomaselli, & Marban, 2002), homotetramers form potassium channels (Doyle, et al., 1998), homopentamers form phospholamban channels (Oxenoid & Chou, 2005), (Oxenoid, Rice, & Chou, 2007) and heteropentamers form the α7 nicotinic acetylcholine receptor (Chou, 2004). Some transitions occur only in tetramers, dimers bind some ligands and tetramers form some ion channels.

In this method, we have again classified GPCRs at three levels, as in chapter 3. We have hybridized three feature extraction approaches: split amino acid composition (SAAC), pseudo amino acid (PseAA) composition and fast Fourier transform (FFT). In PseAA we have employed two physicochemical properties, electronic and bulk, which are explained in chapter 3. All of these feature extraction strategies are explained in chapter 2. PseAA contributes 62 features, SAAC 60 and FFT 256, giving 378 features in total. Because the hybrid feature vector is this large, we have applied principal component analysis (PCA) to reduce the features and avoid the curse of dimensionality. After applying PCA, the size of the feature vector is reduced to 180.

For classification we have used the nearest neighbor algorithm. We have computed the nearest neighbors of a test sequence in two ways: with the grey incidence degree measure and with the Euclidean distance measure. The grey incidence degree measure performs better than the Euclidean distance. We have trained and tested our methods on D8354 and compared them with other methods on the datasets D167 and D566. An overview of the chapter is shown in Figure 4-1.
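The flow described above (hybrid feature vector, PCA reduction to 180 features, nearest neighbor classification) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the feature blocks are random placeholders rather than real SAAC, PseAA or FFT features, the labels are hypothetical, the helper names `fake_features`, `fit_pca`, `transform` and `predict_1nn_euclidean` are invented, and only the Euclidean variant of the nearest neighbor search is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_features(n):
    """Random stand-ins for the three feature blocks (NOT real GPCR features).

    Block widths follow the chapter: PseAA = 62, SAAC = 60, FFT = 256.
    """
    pseaa = rng.normal(size=(n, 62))
    saac = rng.normal(size=(n, 60))
    fft = rng.normal(size=(n, 256))
    return np.hstack([pseaa, saac, fft])  # hybrid vector with 378 columns

def fit_pca(X, n_components):
    """Return the training mean and the leading eigenvectors of the covariance matrix."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)          # 378 x 378 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh: symmetric input, ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]

def transform(X, mean, components):
    """Project mean-adjusted data onto the retained eigenvectors."""
    return (X - mean) @ components

def predict_1nn_euclidean(Z_train, y_train, Z_test):
    """Label each test row with the class of its closest training row."""
    dists = np.linalg.norm(Z_test[:, None, :] - Z_train[None, :, :], axis=2)
    return y_train[dists.argmin(axis=1)]

X_train = fake_features(300)
X_test = fake_features(5)
y_train = rng.integers(0, 5, size=300)            # five hypothetical family labels

mean, components = fit_pca(X_train, 180)
Z_train = transform(X_train, mean, components)    # 378 features reduced to 180
Z_test = transform(X_test, mean, components)
pred = predict_1nn_euclidean(Z_train, y_train, Z_test)
```

The PCA projection is fitted on the training data only and then reused for the test data, so the test sequences do not influence the retained components.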
Figure 4-1: Overview of the chapter

4.1. GREY INCIDENCE DEGREE MEASURE

Deng introduced grey theory in 1982 to analyze the uncertainty of a system (Deng, 1982). The theory is applicable to problems in which information is fuzzy or uncertain. The grey incidence degree (GID) measure is one of the major components of this theory (Liu, Fang, & Lin, 2005). The classification of GPCRs is also a fuzzy problem: some GPCR sequences can be put into one class on the basis of some properties, but into another class on the basis of other properties.

The grey relational coefficient between the test sequence and the i-th training sequence for feature k is

$$\gamma\left(T_k^t, T_k^i\right) = \frac{\Delta_{\min} + \xi\,\Delta_{\max}}{\left|T_k^t - T_k^i\right| + \xi\,\Delta_{\max}} \tag{4.1}$$

where $T = \{T^1, T^2, \ldots, T^n\}$ are the numeric forms of the n training sequences, $T^t$ is the test sequence and $\gamma\left(T_k^t, T_k^i\right)$ is the grey relational coefficient. The minimum and maximum differences are

$$\Delta_{\min} = \min_j \min_k \left|T_k^t - T_k^j\right|, \qquad \Delta_{\max} = \max_j \max_k \left|T_k^t - T_k^j\right| \tag{4.2}$$

where $j = 1, 2, \ldots, n$ are the indices of the training sequences, $k = 1, 2, \ldots, 180$ are the indices of the features of a GPCR sequence and $\xi$ is the distinguishing coefficient. The value of the distinguishing coefficient is between 0 and 1.
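As a minimal sketch of how the grey relational coefficient can be turned into a nearest neighbor rule, the following NumPy code computes the coefficients for one test vector against every training vector, combines them into one score per training vector and picks the training vector with the highest score. The function names are hypothetical, the toy vectors are invented for illustration, and the normalisation of the equal weights to 1/n_features is an assumption of this sketch; equal weights and ξ = 0.5 are the settings the chapter reports.

```python
import numpy as np

def grey_relational_coefficients(test, train, xi=0.5):
    """Coefficients of one test vector against all training vectors.

    test has shape (n_features,), train has shape (n_train, n_features).
    The result has shape (n_train, n_features).
    """
    test = np.asarray(test, dtype=float)
    train = np.asarray(train, dtype=float)
    diff = np.abs(train - test)          # |T_k^t - T_k^j| for every j and k
    d_min, d_max = diff.min(), diff.max()
    if d_max == 0.0:                     # all training vectors equal the test vector
        return np.ones_like(diff)
    return (d_min + xi * d_max) / (diff + xi * d_max)

def grey_incidence_degree(test, train, xi=0.5, weights=None):
    """Weighted sum of the coefficients, one score per training vector."""
    coeff = grey_relational_coefficients(test, train, xi)
    if weights is None:
        # Assumption of this sketch: equal weights that sum to one.
        weights = np.full(coeff.shape[1], 1.0 / coeff.shape[1])
    return coeff @ weights

def nearest_by_gid(test, train, labels, xi=0.5):
    """Return the label of the training vector with the highest degree."""
    return labels[np.argmax(grey_incidence_degree(test, train, xi))]

# Toy data, invented for illustration only.
train = np.array([[1.0, 2.0, 3.0],
                  [1.0, 2.0, 4.0],
                  [5.0, 5.0, 5.0]])
labels = np.array(["A", "A", "B"])
test = np.array([1.0, 2.0, 3.0])

degrees = grey_incidence_degree(test, train)   # approx. [1.0, 0.889, 0.411]
predicted = nearest_by_gid(test, train, labels)
```

Unlike a distance, a higher grey incidence degree means a closer match, so the neighbor search takes the maximum rather than the minimum.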
The grey incidence degree O of the test sequence with a training sequence is a weighted sum of the grey relational coefficients and is given by the following equation.

$$O\left(T^t, T^i\right) = \sum_{k=1}^{180} w_k\, \gamma\left(T_k^t, T_k^i\right) \tag{4.3}$$

where $\gamma\left(T_k^t, T_k^i\right)$ is the grey relational coefficient for feature k and $w_k$ is the weight associated with that feature. We have given equal weight to each feature and taken the value of ξ equal to 0.5, as in existing work (Tsai, Liou, & Jiang, 2005), (Xiao, Wang, & Chou, 2009). The grey incidence degree $O\left(T^t, T^i\right)$ is the correlation between the test sequence $T^t$ and the training sequence $T^i$. The training sequence closest to the test sequence has a higher grey incidence degree than the other training sequences and hence can annotate the test sequence with its class. In this method, we have employed GID in the nearest neighbor algorithm to compute the neighbors of a test sequence, which in turn help to annotate the test sequence.

4.2. PRINCIPAL COMPONENT ANALYSIS

Principal component analysis (PCA) is a useful technique in pattern classification and machine learning for analyzing patterns in high-dimensional data and for highlighting the differences and similarities in the data. It transforms high-dimensional data into a lower-dimensional representation without the loss of significant information. PCA is used in many different fields, from neuroscience to computer graphics, because it is a nonparametric method for extracting relevant information from confusing data sets. The mathematical description of PCA is summarized below; the details are explained in (Howard, 2000).

Suppose we have multidimensional data. We first compute the mean of each dimension and subtract it from every value of that dimension, so that the data has zero mean. Then we calculate the covariance matrix of the zero-mean data. The covariance matrix shows the relation between the different dimensions of the high-dimensional data. Covariance is always measured between two dimensions, so it is defined only for data with at least two dimensions.
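The covariance matrix is an N × N matrix, where N is the number of dimensions of the data. The covariance of a dimension with itself is equal to the variance of that dimension.

$$\mathrm{COV}(X, Y) = \frac{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{n - 1} \tag{4.4}$$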
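The covariance of Eq. 4.4 and the resulting N × N matrix can be checked with a short NumPy sketch. The helper names `covariance` and `covariance_matrix` and the four toy data points are assumptions made for illustration; the explicit loops mirror the formula rather than aiming for speed.

```python
import numpy as np

def covariance(x, y):
    """Sample covariance between two dimensions, following Eq. 4.4."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    return np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

def covariance_matrix(data):
    """N x N covariance matrix for data of shape (n_points, N)."""
    data = np.asarray(data, dtype=float)
    n_dims = data.shape[1]
    cov = np.empty((n_dims, n_dims))
    for a in range(n_dims):
        for b in range(n_dims):
            cov[a, b] = covariance(data[:, a], data[:, b])
    return cov

# Toy data: four points in three dimensions, invented for illustration only.
data = np.array([[2.0, 1.0, 0.0],
                 [4.0, 3.0, 1.0],
                 [6.0, 2.0, 5.0],
                 [8.0, 6.0, 2.0]])
C = covariance_matrix(data)

# The diagonal holds the variance of each dimension (n - 1 in the denominator).
variances = data.var(axis=0, ddof=1)
```

In practice `np.cov(data, rowvar=False)` gives the same matrix in a single call.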
In Eq. 4.4, COV(X, Y) is the covariance between the X and Y dimensions, $\bar{X}$ is the mean of the X dimension, $\bar{Y}$ is the mean of the Y dimension and n is the number of data points. Next, we compute the eigenvalues and eigenvectors of the covariance matrix and sort the eigenvectors by decreasing eigenvalue. We then discard some of the less important eigenvectors, those with the smallest eigenvalues, to reduce the dimensionality of the data. Finally, we multiply the transpose of the matrix of chosen eigenvectors by the mean-adjusted high-dimensional data and use the result as the features for the classification algorithm.

We have named the GID based method GPCRGID (Rehman & Khan, 2011). An overview of GPCRGID is shown in Figure 4-2.

Figure 4-2: Overview of GPCRGID

4.3. RESULTS AND DISCUSSION

As explained at the start of this chapter, we have trained and tested our methods on D8354. The GPCRs in this dataset are classified at three levels: family, sub-family and sub-sub-family. In this method, we have used only the accuracy measure for performance assessment. The following sections give the details of the results.

Family level classification

GPCRs are classified into five families. The accuracy of the GID based method is 97.82%, and that of the Euclidean distance based method is 97.44%.

Sub-family level classification

The five families of GPCRs are further classified into 40 sub-families at this level. The accuracy of the GID based method is 81.55%, and that of the Euclidean distance based method is 80.97%.

Sub-sub-family level classification

The 40 sub-families of GPCRs are further classified into 108 sub-sub-families at this level. The accuracy of the GID based method is 73.32%, and that of the Euclidean distance based method is 72.66%. The performance of both methods is also shown in Figure 4-3.

Figure 4-3: Performance of GID and Euclidean distance methods

Figure 4-3 shows that GPCRGID outperforms the Euclidean distance based method at all three levels, by 0.38, 0.58 and 0.66 percentage points respectively. Hence, we have compared GPCRGID with other existing methods.
Comparison with other methods

We have trained our method on the D8354 dataset and compared it with other methods using D8354. We have also compared our method with existing methods using the D167 and D566 datasets, which are explained in chapter 2. The comparison details are as follows.

Comparison with the selective top-down approach

In the selective top-down approach, GPCRs are hierarchically classified at three levels (Davies, Secker, Freitas, Mendao, Timmis, & Flower, 2007). Its authors assessed performance using the accuracy measure, so we have compared our accuracy with theirs, as shown in Figure 4-4.

Figure 4-4: Comparison with the selective top-down approach

At the family level, the best accuracy achieved by the selective top-down approach is 95.87%, while GPCRGID achieves 97.82%. At the sub-family level, the best accuracy achieved by the selective top-down approach is 80.77%, while GPCRGID achieves 81.55%. At the sub-sub-family level, the selective top-down approach achieves 69.98%, while GPCRGID achieves 73.32%. At all three levels GPCRGID is more accurate than the selective top-down approach, by 1.95, 0.78 and 3.34 percentage points respectively, which strengthens the case for GPCRGID.
Comparison with other existing methods on D167 and D566 datasets

We have compared GPCRGID with six existing methods on the D167 dataset: (Elrod & Chou, 2002), (Huang, Cai, Ji, & Li, 2004), (Bhasin & Raghava, 2005), (Gao & Wang, 2006), (Gao, Wu, Ma, Lu, & He, 2008) and PCAGPCR (Peng, Yang, & Chen, 2010). Again, we have used the accuracy measure for the comparison. The comparison is shown in Figure 4-5, in which GPCRGID is more accurate than all six methods.

Figure 4-5: Comparison on D167

We have compared GPCRGID with two methods on D566. One is PCAGPCR (Peng, Yang, & Chen, 2010) and the other is by Chou (Chou & Elrod, 2002). The accuracy achieved by PCAGPCR is 97.88% and that of (Chou & Elrod, 2002) is 92.05%, whereas the accuracy achieved by GPCRGID is 97.96%.
Figure 4-6: Comparison on D566

Figure 4-6 shows that GPCRGID is more accurate than PCAGPCR and Chou's method (Chou & Elrod, 2002). There are several reasons for this improvement in performance. The first is the hybridization of spatial domain and transformed domain features, followed by PCA for feature reduction. The second is that the GID measure based method can efficiently discriminate classes by computing the quaternary structure of a GPCR numerically.
More informationClustering. Data Mining. Abraham Otero. Data Mining. Agenda
Clustering 1/46 Agenda Introduction Distance Knearest neighbors Hierarchical clustering Quick reference 2/46 1 Introduction It seems logical that in a new situation we should act in a similar way as in
More informationNeural Networks for Sentiment Detection in Financial Text
Neural Networks for Sentiment Detection in Financial Text Caslav Bozic* and Detlef Seese* With a rise of algorithmic trading volume in recent years, the need for automatic analysis of financial news emerged.
More informationJava Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Nonnormal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002Topics in StatisticsBiological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationTtest & factor analysis
Parametric tests Ttest & factor analysis Better than non parametric tests Stringent assumptions More strings attached Assumes population distribution of sample is normal Major problem Alternatives Continue
More informationClustering Methods in Data Mining with its Applications in High Education
2012 International Conference on Education Technology and Computer (ICETC2012) IPCSIT vol.43 (2012) (2012) IACSIT Press, Singapore Clustering Methods in Data Mining with its Applications in High Education
More informationLecture 7 Cogsci 109. Thurs. Oct. 12, 2006
Lecture 7 Cogsci 109 Thurs. Oct. 12, 2006 Announcements Homework 2 is posted. Office hours Midterm is coming up somewhere in the next couple of weeks Anything lectured on, presented in section, in the
More informationCharacteristics and statistics of digital remote sensing imagery
Characteristics and statistics of digital remote sensing imagery There are two fundamental ways to obtain digital imagery: Acquire remotely sensed imagery in an analog format (often referred to as hardcopy)
More informationA new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique
A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique Aida Parbaleh 1, Dr. Heirsh Soltanpanah 2* 1 Department of Computer Engineering, Islamic Azad University, Sanandaj
More informationBIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376
Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM 10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.
More informationData Mining Cluster Analysis: Basic Concepts and Algorithms. Clustering Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analsis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining b Tan, Steinbach, Kumar Clustering Algorithms Kmeans and its variants Hierarchical clustering
More informationIndex Terms: Face Recognition, Face Detection, Monitoring, Attendance System, and System Access Control.
Modern Technique Of Lecture Attendance Using Face Recognition. Shreya Nallawar, Neha Giri, Neeraj Deshbhratar, Shamal Sane, Trupti Gautre, Avinash Bansod Bapurao Deshmukh College Of Engineering, Sewagram,
More informationData Mining Analysis of HIV1 Protease Crystal Structures
Data Mining Analysis of HIV1 Protease Crystal Structures Gene M. Ko, A. Srinivas Reddy, Sunil Kumar, and Rajni Garg AP0907 09 Data Mining Analysis of HIV1 Protease Crystal Structures Gene M. Ko 1, A.
More information1 Introduction to Matrices
1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns
More informationSupport Vector Machines with Clustering for Training with Very Large Datasets
Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano
More information