From Formal Concept Analysis to idempotent co-clustering
1 From Formal Concept Analysis to idempotent co-clustering
Francisco J. Valverde-Albacete
Dep. Lenguajes y Sistemas Informáticos, NLP & IR group, UNED, Spain
02/04/2013, Seminario MAVIR, Madrid, Spain
F.J. Valverde (NLP&IR, UNED) From FCA to kfca NLP&IR / 32
2 Outline
1 Motivation
- Co-clustering as a DM task
- A model of batch ad-hoc retrieval
- Biclustering in IR
2 The basics of Formal Concept Analysis
- Definitions
- The Concept Lattice
3 The KFCA analysis of Confusion Matrices
- Representations of Confusion Matrices
- $\mathbb{R}_{\min,+}$-FCA of Confusion Matrices
4 Discussion and conclusions
3 Biclustering, coclustering: a definition
Given:
- a set of samples (or objects, observations, etc.) $G$, with $|G| = g$,
- a set of features (or attributes, etc.) $M$, with $|M| = m$, and
- a data matrix $R \in K^{g \times m}$, where $K$ is generally any non-negative section of a field, say $\mathbb{R}^{+}_{0}$.
Direct clustering [Hartigan, 1972]: generate permutations for rows $I$ and columns $J$ so that $R(I, J)$ is block diagonal.
More generally [Mirkin, 1996], generate biclusters, that is, pairs $(A, B)$ of sets of samples $A \subseteq G$ and features $B \subseteq M$ that are naturally related to each other.
4 Biclustering definition (II)
Different models of what a matrix is generate different concepts of "natural relations" and different algorithms:
- As a contingency matrix: find a non-negative factorization minimizing the reconstruction loss.
  - Iterative (direct clustering) techniques
  - Non-negative matrix factorization techniques
- As a bipartite (weighted) graph: maximize/minimize a measure on a cut.
  - Graph-partitioning techniques
  - Spectral coclustering techniques
- As a product of random variables (RVs): minimize the loss of mutual information, with coclustering taken as a compression of the joint distribution.
  - Information-theoretic techniques
6 A model for batch ad-hoc tasks [Fuhr, 1992]
[Figure: An adaptation of the conceptual model of Fuhr, relating queries $Q$, relevance $R$ and documents $D$.]
Given $D$, $Q$ and $R$, the ideal IR system is $S_{D,Q}(R) = \langle \varrho_R \rangle$, with a relevance function
$$\varrho_R : Q \to 2^D, \qquad q_i \mapsto \varrho_R(q_i) = \{ d_j \in D \mid d_j \, R \, q_i \}. \tag{1}$$
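7 An IR model solving the batch ad-hoc task
Given a collection $D_T \subseteq D$, a set of topics $Q_T \subseteq Q$, and a set of relevance judgments $R_T \subseteq D_T \times Q_T$, the implemented IR system $S_{D,Q}(\hat{R}) = \langle \varrho_{\hat{R}} \rangle$ is what we can actually build, with approximated relevance $\hat{R} \approx R$ using a retrieval function:
$$\varrho_{\hat{R}} : Q \to 2^D, \qquad q_i \mapsto \varrho_{\hat{R}}(q_i) = \{ d_j \in D \mid d_j \, \hat{R} \, q_i \}.$$
For each query $q \in Q$ we have precision $P_{\hat{R}}$ and recall $R_{\hat{R}}$:
$$P_{\hat{R}}(q) = \frac{|\varrho_R(q) \cap \varrho_{\hat{R}}(q)|}{|\varrho_{\hat{R}}(q)|}, \qquad R_{\hat{R}}(q) = \frac{|\varrho_R(q) \cap \varrho_{\hat{R}}(q)|}{|\varrho_R(q)|}.$$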
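The precision and recall of a single query can be sketched directly from the set definitions above. The document identifiers are hypothetical toy data, and returning 0.0 when a denominator is empty is an assumption of this sketch, not something the slides specify.

```python
def precision_recall(relevant, retrieved):
    """Return (precision, recall) for one query.

    relevant  -- set of ids in the ideal answer (the relevance function)
    retrieved -- set of ids returned by the implemented system
    """
    hits = len(relevant & retrieved)
    # Assumption: an empty denominator yields 0.0 rather than an error.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data for one query.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d2", "d3", "d5"}
p, r = precision_recall(relevant, retrieved)
print(p, r)
```

Here two of the three retrieved documents are relevant, and two of the four relevant documents are retrieved.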
9 Biclusters appear naturally for relevance relations...
Consider a set of queries $B \in 2^Q$. It is natural to think of the set of documents relevant to all of those queries:
$$B^R = \{ d \in D \mid \forall q \in B,\ d R q \}$$
Dually, consider a set of documents $A \in 2^D$ and the set of queries for which all of those documents are relevant:
$$A^R = \{ q \in Q \mid \forall d \in A,\ d R q \}$$
Clearly, any pair $(A, B)$ such that $A^R = B$ and $B^R = A$ is a bicluster.
Q: What is the organization of $D$ and $Q$ implied by this coclustering?
10 The affordances of Formal Concept Analysis
Affordance 1: FCA implements the (conjunctive) Boolean model of IR [Godin et al., 1986, Valverde-Albacete, 2006].
- There exists a set of keywords $T$ (after normalization, stoplisting, stemming).
- Queries are represented as sets of keywords, $Q \subseteq 2^T$.
- Documents are represented as sets of keywords, $D \subseteq 2^T$.
- Retrieval (estimated relevance) $\hat{R}$ is modelled as inclusion: $d \, \hat{R} \, q \iff q \subseteq d$.
- The retrieval function is the query polar, $\varrho_{\hat{R}}(q) = q^{\hat{R}}$.
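As a minimal sketch of the conjunctive Boolean model described above, retrieval by inclusion reduces to a subset test between keyword sets. The collection and keywords below are hypothetical, and `retrieve` is an illustrative helper name.

```python
# Hypothetical collection: each document is a set of (already normalized) keywords.
docs = {
    "d1": {"lattice", "concept", "retrieval"},
    "d2": {"lattice", "cluster"},
    "d3": {"concept", "retrieval", "cluster"},
}


def retrieve(query, docs):
    """Return the ids of documents whose keyword set contains every query keyword."""
    return {doc_id for doc_id, terms in docs.items() if query <= terms}


result = retrieve({"concept", "retrieval"}, docs)
print(sorted(result))
```

Under this model an empty query retrieves the whole collection, since the empty set is included in every document.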
12 Formal contexts and their polars
A Formal Context $(D, Q, R)$ is a triple of:
- a set of objects $D$,
- a set of attributes $Q$,
- a (Boolean) incidence relation $R \in 2^{D \times Q}$, where $d R q$ reads "object $d$ has attribute $q$".
The polars of the formal context: given $(D, Q, R)$ and subsets of objects $A \subseteq D$ and attributes $B \subseteq Q$,
$$\varphi(\cdot) : 2^D \to 2^Q, \qquad \varphi(A) = A^R = \{ q \in Q \mid \forall d \in A,\ d R q \}$$
$$\psi(\cdot) : 2^Q \to 2^D, \qquad \psi(B) = B^R = \{ d \in D \mid \forall q \in B,\ d R q \}$$
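13 Formal contexts and polars (II)
The polars form an (antitone) Galois connection $(\varphi, \psi) : 2^D \rightleftharpoons 2^Q$:
$$\varphi(A) \supseteq B \iff \psi(B) \supseteq A \tag{2}$$
The closures of the polars, $\gamma_D = \psi \circ \varphi$ and $\gamma_Q = \varphi \circ \psi$, are monotone, expansive and idempotent:
- monotone: $A_1 \subseteq A_2 \Rightarrow \gamma_D(A_1) \subseteq \gamma_D(A_2)$ and $B_1 \subseteq B_2 \Rightarrow \gamma_Q(B_1) \subseteq \gamma_Q(B_2)$,
- expansive: $\gamma_D(A) \supseteq A$ and $\gamma_Q(B) \supseteq B$,
- idempotent: $\gamma_D(\gamma_D(A)) = \gamma_D(A)$ and $\gamma_Q(\gamma_Q(B)) = \gamma_Q(B)$.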
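The polars and their closures can be checked on a small example. The following is a sketch on a hypothetical three-object, three-attribute context; the names `phi`, `psi` and `closure_D` mirror the symbols of the slides but are otherwise illustrative.

```python
from itertools import combinations

# Hypothetical toy context: R holds the (object, attribute) pairs in the incidence.
D = {"d1", "d2", "d3"}
Q = {"q1", "q2", "q3"}
R = {("d1", "q1"), ("d1", "q2"), ("d2", "q2"), ("d3", "q2"), ("d3", "q3")}


def phi(A):
    """Attributes shared by every object in A."""
    return frozenset(q for q in Q if all((d, q) in R for d in A))


def psi(B):
    """Objects that have every attribute in B."""
    return frozenset(d for d in D if all((d, q) in R for q in B))


def closure_D(A):
    """Closure on object sets: psi after phi."""
    return psi(phi(A))


def subsets(S):
    """All subsets of S as frozensets."""
    items = sorted(S)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]


# Check the Galois condition and the closure properties exhaustively.
for A in subsets(D):
    assert A <= closure_D(A)                      # expansive
    assert closure_D(closure_D(A)) == closure_D(A)  # idempotent
    for B in subsets(Q):
        assert (B <= phi(A)) == (A <= psi(B))     # antitone Galois connection

print(sorted(closure_D({"d2"})))
```

In this toy context `d2` only has `q2`, which every object has, so the closure of `{d2}` is the whole object set.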
15 Concepts
The set of concepts:
$$(A, B) \in \mathfrak{B}(D, Q, R) \iff \varphi(A) = B \ \text{and} \ A = \psi(B)$$
Extents and intents: let $c = (A, B) \in \mathfrak{B}(D, Q, R)$. Then
$$\mathrm{ext}(\cdot) : \mathfrak{B}(D, Q, R) \to 2^D, \quad \mathrm{ext}(c) = A, \qquad \mathrm{int}(\cdot) : \mathfrak{B}(D, Q, R) \to 2^Q, \quad \mathrm{int}(c) = B$$
The concept order $\underline{\mathfrak{B}}(D, Q, R) = \langle \mathfrak{B}(D, Q, R), \leq \rangle$:
$$(A_1, B_1) \leq (A_2, B_2) \iff A_1 \subseteq A_2 \iff B_1 \supseteq B_2$$
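To make the definition concrete, the concepts of a small context can be enumerated by closing every subset of objects. This is a deliberately naive sketch (exponential in the number of objects) on hypothetical data, not one of the dedicated lattice-building algorithms.

```python
from itertools import combinations

# Hypothetical toy context: each object maps to the set of attributes it has.
context = {
    "d1": {"q1", "q2"},
    "d2": {"q2"},
    "d3": {"q2", "q3"},
}
attributes = {"q1", "q2", "q3"}


def intent_of(objs):
    """Attributes common to all objects in objs."""
    return frozenset(q for q in attributes if all(q in context[d] for d in objs))


def extent_of(attrs):
    """Objects having all attributes in attrs."""
    return frozenset(d for d in context if attrs <= context[d])


def concepts():
    """Enumerate all (extent, intent) pairs by closing every subset of objects."""
    found = set()
    objs = sorted(context)
    for k in range(len(objs) + 1):
        for A in combinations(objs, k):
            B = intent_of(A)
            found.add((extent_of(B), B))
    return found


lattice = concepts()
# Print concepts from smallest to largest extent.
for extent, intent in sorted(lattice, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Each pair produced this way satisfies both defining equations, because the intent is computed from the object subset and the extent is then recomputed from that intent.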
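16 The fundamental theorem of Formal Concept Analysis
The Concept Lattice $\underline{\mathfrak{B}}(D, Q, R)$ is a complete lattice in which infima and suprema are given by:
$$\bigwedge_{i \in I} (A_i, B_i) = \Big( \bigcap_{i \in I} A_i,\ \Big[ \bigcup_{i \in I} B_i \Big]^{RR} \Big), \qquad \bigvee_{i \in I} (A_i, B_i) = \Big( \Big[ \bigcup_{i \in I} A_i \Big]^{RR},\ \bigcap_{i \in I} B_i \Big)$$
A complete lattice $V$ is isomorphic to $\underline{\mathfrak{B}}(D, Q, R)$ if and only if there are mappings
$$\gamma : D \to V, \quad \gamma(D) \supseteq \mathcal{J}(V), \qquad \mu : Q \to V, \quad \mu(Q) \supseteq \mathcal{M}(V)$$
such that $d R q \iff \gamma(d) \leq \mu(q)$. In particular, $V \cong \underline{\mathfrak{B}}(V, V, \leq)$.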
17 A logarithmic connection...
By the fundamental theorem, the incidence and the Concept Lattice are interchangeable: they are a pair of analysis and synthesis equations!
Metaphor:
- The concept lattice is the exponential of the formal context.
- The formal context is the logarithm of the concept lattice.
19 Examples: Confusion matrices of multiclass classifiers
The data matrix has entries in a semiring: $N_{DQ} \in \mathbb{N}^{D \times Q}$.
[Figure: $N_{DQ}$ at SNR = 0 dB, with rows and columns labelled p, m, t, f, th, k, s.]
Notice: no symmetry, some sparsity.
20 Usual representation: heatmaps
21 Boolean confusion matrix and lattice of confusions
Confusion matrices can be made Boolean, and so subjected to FCA, by simply thresholding the counts.
Confusion lattices represent some information about the confusion matrix (CM):
- Stimuli appear in white boxes; percepts in grey.
- The strength of the confusion is not clear.
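A sketch of the thresholding step, assuming a hypothetical 3x3 count matrix and assuming that an entry becomes `True` when its count is greater than or equal to the threshold (the slides do not say whether the comparison is strict):

```python
# Hypothetical confusion counts: rows are stimuli, columns are percepts.
counts = [
    [30, 4, 0],
    [5, 25, 1],
    [0, 6, 20],
]


def threshold(matrix, tau):
    """Boolean incidence: True where the count reaches the threshold tau."""
    return [[n >= tau for n in row] for row in matrix]


incidence = threshold(counts, 5)
for row in incidence:
    print(row)
```

Different thresholds give different Boolean contexts and hence different lattices, which is one reason the strength of a confusion is lost in this representation.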
23 Data preparation
We substitute $N_{DQ} \in \mathbb{N}^{D \times Q}$ by $MI_{DQ} \in \mathbb{R}_{\min,+}^{D \times Q}$ by computing the pointwise mutual information of the count matrix $N_{DQ}$.
The MLE of the joint probability is
$$\hat{P}_{DQ}(a_i, b_j) = \frac{n_{ij}}{\sum_{ij} n_{ij}},$$
with marginals, for $N = \sum_{ij} n_{ij}$,
$$\hat{P}_D(a_i) = \sum_j n_{ij} / N, \qquad \hat{P}_Q(b_j) = \sum_i n_{ij} / N.$$
Then the mutual information matrix becomes
$$MI_{DQ}(a_i, b_j) = \log \left( \frac{\hat{P}_{DQ}(a_i, b_j)}{\hat{P}_D(a_i)\, \hat{P}_Q(b_j)} \right).$$
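The data preparation step can be sketched in a few lines. Two choices below are assumptions of this sketch rather than statements from the slides: the logarithm is taken in base 2, and a zero count is mapped to negative infinity. The count matrix is hypothetical.

```python
import math


def pmi_matrix(counts):
    """Pointwise mutual information of a count matrix, from MLE probabilities."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]        # N * P_D(a_i)
    col_sums = [sum(col) for col in zip(*counts)]  # N * P_Q(b_j)
    pmi = []
    for i, row in enumerate(counts):
        pmi_row = []
        for j, n in enumerate(row):
            if n == 0:
                # Assumption: log(0) is represented as -infinity.
                pmi_row.append(-math.inf)
            else:
                # (n/N) / ((row/N) * (col/N)) simplifies to n*N / (row*col).
                pmi_row.append(math.log2(n * total / (row_sums[i] * col_sums[j])))
        pmi.append(pmi_row)
    return pmi


# Hypothetical 2x2 confusion counts.
counts = [[8, 2], [2, 8]]
mi = pmi_matrix(counts)
for row in mi:
    print([round(v, 3) for v in row])
```

Positive entries mark pairs that co-occur more often than independence would predict, and negative entries mark pairs that co-occur less often.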
24 Generalized Formal Concept Analysis
The entries are now in the min-plus semiring: $MI \in \mathbb{R}_{\min,+}^{D \times Q}$.
[Figure: (pointwise) mutual information from $N_{DQ}$, with rows and columns labelled p, m, t, f, th, k, s.]
Interpretations of $MI(i, j) = \lambda$:
- stimulus $i$ is confused with percept $j$ in degree $\lambda$,
- percept $j$ is taken for stimulus $i$ to degree $\lambda$.
25 Generalized Formal Concept Analysis (cont) [Valverde-Albacete and Peláez-Moreno, 2011]
A K-valued formal context is a triple $(D, Q, R)_K$ with:
- $K$, a complete, reflexive idempotent semifield,
- two finite sets of objects $D$ and attributes $Q$,
- a K-valued incidence between them, $R \in K^{D \times Q}$, where $R(d, q) = \lambda$ reads as "object $d$ has attribute $q$ in degree $\lambda$" or "attribute $q$ is manifested in object $d$ to degree $\lambda$".
26 ϕ-polars
Consider $(D, Q, R)_K$, an invertible $\varphi \in K$ and the bracket $\langle y \mid R \mid x \rangle = y^T \otimes R \otimes x$. Then the ϕ-polars are the maps of the Galois connection between $K^D$ and $K^Q$:
$$(y)^{\varphi}_{R} = (y^T \otimes R) \backslash \varphi, \qquad {}^{\varphi}_{R}(x) = \varphi / (R \otimes x)$$
[Diagram: the Galois connection between $K^D$ and $K^Q$ induced by the ϕ-polars, with closures $\gamma_{K^D}$ and $\gamma_{K^Q}$.]
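The bracket can be illustrated for the min-plus case under stated assumptions. The sketch below assumes that $\otimes$ is the min-plus matrix product (semiring addition is `min`, semiring multiplication is ordinary `+`) and that all entries are finite; it does not implement the residuations `\` and `/` of the ϕ-polars, nor the treatment of infinite values in a complete semifield. The function names and data are hypothetical.

```python
def minplus_matvec(R, x):
    """Min-plus product of matrix R and vector x: result_i = min_j (R[i][j] + x[j])."""
    return [min(r_ij + x_j for r_ij, x_j in zip(row, x)) for row in R]


def minplus_bracket(y, R, x):
    """Min-plus bracket of y, R, x: min over i, j of (y[i] + R[i][j] + x[j])."""
    return min(y_i + v_i for y_i, v_i in zip(y, minplus_matvec(R, x)))


# Hypothetical finite-valued data.
R = [[0.0, 3.0],
     [2.0, 1.0]]
x = [1.0, 0.0]
y = [0.0, 5.0]

print(minplus_matvec(R, x))
print(minplus_bracket(y, R, x))
```

The bracket plays the role that the bilinear form $y^T R x$ plays in ordinary linear algebra, with sums replaced by minima and products by sums.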
27 Formal ϕ-concepts
A (formal) ϕ-concept of the formal context $(D, Q, R)_K$ is a pair $(a, b) \in K^D \times K^Q$ such that $(a)^{\varphi}_{R} = b$ and ${}^{\varphi}_{R}(b) = a$. We call $a$ the ϕ-extent and $b$ the ϕ-intent of the concept $(a, b)$, and $\varphi$ its (minimum) degree of existence.
- ϕ-concepts are pairs $(A, B)_\varphi$ with properties similar to those of standard Formal Concept Analysis.
- $\varphi \in \mathbb{R}$ describes a minimum degree of existence required for pairs $(A, B) \in \mathbb{R}_{\min,+}^{D} \times \mathbb{R}_{\min,+}^{Q}$ to be considered members of the ϕ-lattice $\mathfrak{B}^{\varphi}(D, Q, MI_{DQ})_{\mathbb{R}_{\min,+}}$.
28 Basic theorem of K-valued Formal Concept Analysis, finite version, 1st half
The hierarchical order: if $(a_1, b_1)$ and $(a_2, b_2)$ are ϕ-concepts,
$$(a_1, b_1) \leq (a_2, b_2) \iff a_1 \leq_{K^D} a_2 \iff b_1 \leq^{op}_{K^Q} b_2$$
Given a reflexive, idempotent semiring $(K, \varphi)$, the ϕ-concept lattice $\mathfrak{B}^{\varphi}(D, Q, R)_K$ of a K-valued formal context $(D, Q, R)_K$ is a (finite, complete) lattice.
[Equation not recoverable from the transcript: the infimum $\bigwedge_{t \in T} (a_t, b_t)$ and the supremum $\bigvee_{t \in T} (a_t, b_t)$ are expressed through the ϕ-polars of the $a_t$ and $b_t$.]
29 Challenges: Theoretical
- The relationship with the VSM in NLP and IR is very evident.
  - tf-idf is related to mutual information (Roelleke, 2008).
  - (K)FCA is a VSM in a different algebraic setting.
- The entailments are very enticing:
  - There is a concept lattice structure underlying the VSM.
  - There is an actual topology of information that is finer than the discrete topology.
  - KFCA actually shows how IR and IF are two sides of the same coin.
- The development of idempotent semiring algebra is way behind that of normal algebra (e.g. there is no known SVD, so idempotent LSI is unavailable).
30 Challenges: Practical
- The complexity of concept lattice (CL) building algorithms is not good: $O(|D|\,|Q|\,K)$, where $K$ is the number of concepts in the lattice. But Big Data techniques may be of great help.
- Most toolkits deal with the dense context case, which for us is less interesting.
- The theory is agnostic with respect to the interpretations of $D$ and $Q$. This is a mixed blessing.
31 Summary: (K)FCA as a coclustering strategy...
KFCA does not try to solve the original (direct clustering) task. But it provides an alternative look into the task that makes it more realistic and varied:
- It deals naturally with lack of symmetry (confusion matrices).
- It deals naturally with data with many objects and few attributes (GED data) or vice versa (itemset analysis).
Most of the advantages stem from:
- A very solid theory (FCA).
- A deep understanding of the maths behind it (order and lattice theory).
- Appropriateness of use: KFCA deals with counts, probabilities, concentrations: all positive quantities.
32 Thank you!
33 References
- Norbert Fuhr. Probabilistic models in information retrieval. The Computer Journal, 35(3), 1992.
- R Godin, E Saunders, and Jan Gecsei. Lattice model of browsable data spaces. Information Sciences, 40:89-116, 1986.
- J Hartigan. Direct clustering of a data matrix. Journal of the American Statistical Association, Jan 1972.
- Boris Mirkin. Mathematical Classification and Clustering, volume 11 of Nonconvex Optimization and Its Applications. Kluwer Academic Publishers, 1996.
- Francisco J. Valverde-Albacete. Combining soft and hard techniques for the analysis of batch retrieval tasks. In Enrique Herrera-Viedma, Gabriella Pasi, and Fabio Crestani, editors, Soft Computing for Information Retrieval on the Web. Models and Applications, volume 197 of Studies in Fuzziness and Soft Computing. Springer, 2006.
- Francisco J. Valverde-Albacete and Carmen Peláez-Moreno. Extending conceptualisation modes for generalised Formal Concept Analysis. Information Sciences, 181, May 2011.
More informationPÓLYA URN MODELS UNDER GENERAL REPLACEMENT SCHEMES
J. Japan Statist. Soc. Vol. 31 No. 2 2001 193 205 PÓLYA URN MODELS UNDER GENERAL REPLACEMENT SCHEMES Kiyoshi Inoue* and Sigeo Aki* In this paper, we consider a Pólya urn model containing balls of m different
More informationGroup Theory. Contents
Group Theory Contents Chapter 1: Review... 2 Chapter 2: Permutation Groups and Group Actions... 3 Orbits and Transitivity... 6 Specific Actions The Right regular and coset actions... 8 The Conjugation
More informationMATH1231 Algebra, 2015 Chapter 7: Linear maps
MATH1231 Algebra, 2015 Chapter 7: Linear maps A/Prof. Daniel Chan School of Mathematics and Statistics University of New South Wales danielc@unsw.edu.au Daniel Chan (UNSW) MATH1231 Algebra 1 / 43 Chapter
More informationENRICHED CATEGORIES AND THE FLOYD-WARSHALL CONNECTION Vaughan Pratt Computer Science Dept. Stanford University April, 1989
ENRICHED CATEGORIES AND THE FLOYD-WARSHALL CONNECTION Vaughan Pratt Computer Science Dept. Stanford University April, 1989 Abstract We give a correspondence between enriched categories and the Gauss-Kleene-Floyd-Warshall
More informationSEARCHING AND KNOWLEDGE REPRESENTATION. Angel Garrido
Acta Universitatis Apulensis ISSN: 1582-5329 No. 30/2012 pp. 147-152 SEARCHING AND KNOWLEDGE REPRESENTATION Angel Garrido ABSTRACT. The procedures of searching of solutions of problems, in Artificial Intelligence
More informationContent-Based Recommendation
Content-Based Recommendation Content-based? Item descriptions to identify items that are of particular interest to the user Example Example Comparing with Noncontent based Items User-based CF Searches
More informationPayment streams and variable interest rates
Chapter 4 Payment streams and variable interest rates In this chapter we consider two extensions of the theory Firstly, we look at payment streams A payment stream is a payment that occurs continuously,
More informationSearch and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
More informationInner Product Spaces
Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and
More information2.1 Introduction. 2.2 Terms and definitions
.1 Introduction An important step in the procedure for solving any circuit problem consists first in selecting a number of independent branch currents as (known as loop currents or mesh currents) variables,
More informationHigh degree graphs contain large-star factors
High degree graphs contain large-star factors Dedicated to László Lovász, for his 60th birthday Noga Alon Nicholas Wormald Abstract We show that any finite simple graph with minimum degree d contains a
More informationNotes on Determinant
ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without
More informationPre-Calculus Semester 1 Course Syllabus
Pre-Calculus Semester 1 Course Syllabus The Plano ISD eschool Mission is to create a borderless classroom based on a positive student-teacher relationship that fosters independent, innovative critical
More informationSpatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
More informationA linear combination is a sum of scalars times quantities. Such expressions arise quite frequently and have the form
Section 1.3 Matrix Products A linear combination is a sum of scalars times quantities. Such expressions arise quite frequently and have the form (scalar #1)(quantity #1) + (scalar #2)(quantity #2) +...
More informationThe Characteristic Polynomial
Physics 116A Winter 2011 The Characteristic Polynomial 1 Coefficients of the characteristic polynomial Consider the eigenvalue problem for an n n matrix A, A v = λ v, v 0 (1) The solution to this problem
More informationLatent Semantic Indexing with Selective Query Expansion Abstract Introduction
Latent Semantic Indexing with Selective Query Expansion Andy Garron April Kontostathis Department of Mathematics and Computer Science Ursinus College Collegeville PA 19426 Abstract This article describes
More informationMATRIX ALGEBRA AND SYSTEMS OF EQUATIONS
MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a
More information= 2 + 1 2 2 = 3 4, Now assume that P (k) is true for some fixed k 2. This means that
Instructions. Answer each of the questions on your own paper, and be sure to show your work so that partial credit can be adequately assessed. Credit will not be given for answers (even correct ones) without
More informationSoft Clustering with Projections: PCA, ICA, and Laplacian
1 Soft Clustering with Projections: PCA, ICA, and Laplacian David Gleich and Leonid Zhukov Abstract In this paper we present a comparison of three projection methods that use the eigenvectors of a matrix
More informationLet H and J be as in the above lemma. The result of the lemma shows that the integral
Let and be as in the above lemma. The result of the lemma shows that the integral ( f(x, y)dy) dx is well defined; we denote it by f(x, y)dydx. By symmetry, also the integral ( f(x, y)dx) dy is well defined;
More informationClassification of Cartan matrices
Chapter 7 Classification of Cartan matrices In this chapter we describe a classification of generalised Cartan matrices This classification can be compared as the rough classification of varieties in terms
More informationCENG 734 Advanced Topics in Bioinformatics
CENG 734 Advanced Topics in Bioinformatics Week 9 Text Mining for Bioinformatics: BioCreative II.5 Fall 2010-2011 Quiz #7 1. Draw the decompressed graph for the following graph summary 2. Describe the
More informationAPPLICATION OF ICT BENEFITS FOR BUILDING PROJECT MANAGEMENT USING ISM MODEL
APPLICATION OF ICT BENEFITS FOR BUILDING PROJECT MANAGEMENT USING ISM MODEL S.V.S.N.D.L.Prasanna 1, T. Raja Ramanna 2 1 Assistant Professor, Civil Engineering Department, University College of Engineering,
More information1 = (a 0 + b 0 α) 2 + + (a m 1 + b m 1 α) 2. for certain elements a 0,..., a m 1, b 0,..., b m 1 of F. Multiplying out, we obtain
Notes on real-closed fields These notes develop the algebraic background needed to understand the model theory of real-closed fields. To understand these notes, a standard graduate course in algebra is
More informationON THE DEGREES OF FREEDOM OF SIGNALS ON GRAPHS. Mikhail Tsitsvero and Sergio Barbarossa
ON THE DEGREES OF FREEDOM OF SIGNALS ON GRAPHS Mikhail Tsitsvero and Sergio Barbarossa Sapienza Univ. of Rome, DIET Dept., Via Eudossiana 18, 00184 Rome, Italy E-mail: tsitsvero@gmail.com, sergio.barbarossa@uniroma1.it
More informationMathematics Course 111: Algebra I Part IV: Vector Spaces
Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationBig Data Technology Motivating NoSQL Databases: Computing Page Importance Metrics at Crawl Time
Big Data Technology Motivating NoSQL Databases: Computing Page Importance Metrics at Crawl Time Edward Bortnikov & Ronny Lempel Yahoo! Labs, Haifa Class Outline Link-based page importance measures Why
More informationClustering. Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016
Clustering Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016 1 Supervised learning vs. unsupervised learning Supervised learning: discover patterns in the data that relate data attributes with
More information