ShapeWIZ - An approach to automatic shape recognition

Joaquim M. C. Costa (1,2)
1 Escola Superior de Enxeñaria Informática de Ourense
2 Universidade de Vigo
costa.joaquim@uvigo.es

Abstract. An approach to an automatic shape recognition scheme is described. The whole process comprises eight stages: Reading, Clustering, Ordering, Simplifying, Guessing, Adapting, Evaluating, and, optionally, Visualizing the results. The scheme works as follows. First, an unorganized set of points is given (the Reading phase). Next, once the parameters have been chosen by the user, the scheme classifies the points into separate subsets (the Clustering phase). The points of each subset are then ordered (the Ordering phase). After that, each subset is approximated (the Simplifying phase) and a candidate output shape is fitted (the Guessing phase). Since the guessed shapes are polygonal, they are then adapted interactively, one at a time, until the goal is achieved (the Adapting and Evaluating phases). Finally, when needed, the results are shown (the Visualizing phase).

Keywords: Curve reconstruction, geometric clustering, alpha shapes, polygonal approximation, automatic shape recognition.
Extended Summary (Resumen en Castellano)

The main goal of this work is to make a first approach to the development of a computer system for automatic shape recognition, by using suitable algorithms and parameters. The work was carried out in the LIA group (Laboratorio de Informática Aplicada) and is intended as a further contribution to the ongoing work on automatic shape recognition. The system consists of three basic phases. For a better understanding of what we intend to demonstrate below, the phases are illustrated (see Fig. 1, Fig. 2 and Fig. 3).

Figure 1. The first phase (Reading the point set)

Automatic shape recognition has direct applications in many areas, such as computer vision, image segmentation, identification of objects within images, pattern recognition, self-learning aids for children and disabled people, and geographic information systems. This work falls within the scope of the Graphics and Computational Geometry Unit, and its development requires skills and knowledge in the areas of algorithmics and mathematics. Basic knowledge is needed about the representation of points, lines, rays, line segments, curves and polygons in the two-dimensional Euclidean plane. For this we have the Cartesian and polar coordinate systems. The Cartesian system is formed by a pair of perpendicular axes that cross at a point called the origin. Usually there is a horizontal axis (the abscissa or x-axis) and a vertical axis (the ordinate or y-axis). In this system a point is represented uniquely by a pair of values (the x-coordinate and the y-coordinate), which are respectively its distances to the origin along each axis.
Figure 2. The second phase (Choosing the algorithms and their parameters appropriately)
Figure 3. The third phase (Showing the results)
The polar coordinate system consists of a fixed point, called the pole, and a ray, called the polar axis, with a fixed direction, usually drawn horizontally and pointing to the right. A point is also represented by two values: the distance, usually called r, measured from the point to the pole, and the angle it makes with the polar axis, usually called θ. Note that, depending on the algorithmic approach taken, it may be necessary to convert coordinate values from one system to the other, using trigonometric formulas for this purpose. Using these concepts we can define all the primitive geometric entities, such as points, lines, line segments, open and closed polygonal lines (polygons), and open and closed curves. To solve Computational Geometry problems, some basic notions are crucial, for example: given two arbitrary points in the plane, what it means for them to be far from each other; what the distance is between a point and a vertex of a polygon; whether a point is inside or outside a circle or a triangle; whether a point is to the left or to the right of a line; or whether a point lies on a line segment, and so on. This kind of problem is solved with distance measures, or metrics. There are several such measures, among others: L1 or Manhattan distance, L2 or Euclidean distance, L∞ or maximum distance, and the vertical distance. Any of these metrics can be used to compute all kinds of distances: from a point to another point, from a point to a line, from a point to a segment, from a point to a polygon, or from a point to a circle or a triangle, etc.
Besides the primitive geometric entities, when developing algorithms to solve geometric problems we also use geometric data structures, among others: the convex hull (see Fig. 4 and Fig. 5), Voronoi diagrams (see Fig. 6), the Delaunay triangulation (see Fig. 7), Gabriel graphs, neighborhood graphs, relative neighborhood graphs, k-neighborhood graphs, etc.

In Computational Geometry, geometric clustering can be defined as the process that separates the points of a set in the plane into clusters, using a previously defined criterion or condition (for example, the geometric distance between the points), in order to reconstruct the curves corresponding to each of the components (if there is more than one). Unless the sampled point set is sufficiently dense and uniform, providing such a criterion is mandatory. Curve reconstruction is the process that connects the points sequentially by line segments. The fundamental data structure for curve reconstruction is the Delaunay triangulation, which in turn is based on the Voronoi diagram and is its dual graph.

Figure 4. Convexity and non-convexity: (a) convexity, (b) non-convexity
Figure 5. The Convex Hull (convex envelope)
Figure 6. A Voronoi diagram
Figure 7. A Voronoi diagram and its Delaunay triangulation

Over the last thirty years, some algorithms have been successfully developed for progressive curve reconstruction, using an edge elimination criterion based on filtering the Delaunay triangulation so as to give a better representation of the presented shapes. Among them we highlight those we studied in detail: α-shapes, β-skeleton, Crust and Gathan.

The system can be represented by the flowchart in Fig. 8. The most important steps are: #2-Clustering, #3-Ordering, #4-Simplification and #6-Adaptation.

Figure 8. ShapeWIZ flowchart

This work, as already stated, aims to be an extension of the final degree projects of David Reboiro Jato and Pedro Silva Calveiro. Its main goal is to describe a system for the automatic recognition of simple shapes, without intersections or overlaps: straight lines, circles and polygons with three or more vertices, up to and including seven. The system consists of actions built on two systems already developed: GathanViewer and Sharec. At each step of the overall process it is possible to read and write text files with the data of the points under examination. In the first step, GathanViewer is used to import the point set, either by loading a formatted text file or by clicking with the mouse on the screen. In the second step, one of the available algorithms is chosen and its parameters are adjusted. A procedure is then applied to find the connected components and, where appropriate, to check whether there is more than one component within the point set. The data of the filtered points or edges are then saved to text files. A simplification process is applied to each component. Next, in the guessing step, each component undergoes a shape recognition process, ranging from the line segment through the triangle and polygons of three up to seven vertices, and also the circle. The adaptation process is applied to each of these candidate shapes. When the process finishes, the minimum value returned by the adaptation function indicates the shape most likely to represent the point set, and the results are then shown.

We now refer to our research in the areas we consider most important. Regarding clustering, the algorithms already mentioned are also a form of clustering. In general, clustering is considered an unsupervised learning problem: its goal is to find structure in an unlabeled data set. It is the process of organizing objects into groups whose members are similar to each other in some way (see Fig. 9). We can classify clustering into four types: exclusive clustering, overlapping clustering, hierarchical clustering and probabilistic clustering. In exclusive clustering an element can belong to one group only. Overlapping clustering uses fuzzy sets to group the data, so that an element may belong to more than one group simultaneously, although with different degrees of membership. In hierarchical clustering the process starts with all the elements; then, at each iteration of the algorithm, the elements are grouped according to an established criterion until the desired number of groups is reached. Probabilistic clustering uses a probabilistic approach when forming the groups. To compute these groups, metrics are needed. The most commonly used are L1, L2 and the Minkowski metric, of which L1 and L2 are the special cases with p = 1 and p = 2, respectively.
Figure 9. Distance-based clustering

In Computational Geometry, we can regard ordering as the process of finding the shortest path through a given set of points connected under a certain criterion. This means that all the vertices are connected by line segments joining them in sequence. The filtering algorithms seen above are also, in a way, a means of obtaining an ordering. The typical problem of ordering a graph is NP-complete (it is related to the traveling salesman problem, TSP).

Regarding step three, we define the simplification (or approximation) of a curve as follows: given a polygonal curve represented by several points, we intend to obtain another curve represented by the smallest possible number of line segments. Curve simplification problems can be divided into min-# problems and min-ε problems. In the min-# problem we are given a curve with n line segments and a tolerance ε, and we must find a new curve with the minimum number m of line segments whose error does not exceed ε. In the min-ε problem we are given a curve with n line segments and an integer m < n, and we must find another curve with m line segments that minimizes the error. There are also two types of curves regarding their constitution: open curves and closed curves. To work with these approaches, error metrics are used, based on the distance measures mentioned before. Several algorithms have been studied, covering several kinds of approaches such as dynamic programming and heuristic approaches (see Table 1, Table 2 and Table 3).

Presented by | Date
Stone [1] | 1961
Bellman [2] | 1961
Gluss [3] | 1962
Lawson [4] | 1964
Cox [5] | 1971
Cantoni [6] | 1971
Perez and Vidal [7] | 1994
Chen, Ventura and Wu [8] | 1996
Heckbert and Garland [9] | 1997
Tsen et al. [10] | 1998
Mori et al. [11] | 1999
Salloti [12] | 2001
Kolesnikov [13,14,15] | 2003

Table 1. Dynamic programming algorithms

Approach | Presented by | Date
Dominant point detection | Attneave [16] | 1954
Sequential algorithm | Sklansky and Gonzalez [17] | 1972
Split | Douglas and Peucker [18] | 1973
Split and merge | Pavlidis and Horowitz [19] | 1974
Relaxation labelling | Davis and Rosenfeld [20] | 1977
Split | Hershberger and Snoeyink [21] | 1992
Merge | Pikaz and Dinstein [22] | 1995

Table 2. Heuristic approaches (classical algorithms)

Approach | Presented by | Date
K-means | Philips and Rosenfeld [23] | 1988
Vertex adjustment | Chen, Ventura and Wu [8] | 1996
Tabu search | Glover and Laguna [24] | 1997
Genetic algorithms | Yin [25] | 1998
Ant colony algorithm | Vallone [26] | 2002
Ant colony algorithm | Yin [27] | 2003

Table 3. Heuristic approaches (optimization algorithms)

As for the adaptation step, the algorithm used is the optimization algorithm based on the minimization method of García Palomares, which has already been deployed twice, in the final degree projects of David Reboiro Jato and Pedro Silva Calveiro. Geometric structures such as the Voronoi diagram and the Delaunay triangulation, seen earlier, are crucial for the implementation of the algorithms, and the computation involved in building such structures must be taken into account. Thus, several (tested, robust and efficient) libraries were studied and their main features reviewed, both in the types of data structures they implement and in the main geometric algorithms needed for the basic operations.
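As an illustration of the split approach to simplification listed in Table 2 (Douglas and Peucker, 1973): the point furthest from the segment joining the curve's endpoints is kept whenever its deviation exceeds a tolerance ε, and the two halves are processed recursively. A minimal Python sketch of that classical algorithm, not the implementation used in this work:

```python
def douglas_peucker(curve, eps):
    """Simplify an open polyline: keep the interior point furthest from
    the endpoint-to-endpoint segment if it deviates more than eps,
    then recurse on both halves."""
    def perp_dist(p, a, b):
        # Distance from p to the supporting line through a and b.
        (ax, ay), (bx, by), (px, py) = a, b, p
        num = abs((bx - ax) * (ay - py) - (ax - px) * (by - ay))
        den = ((bx - ax) ** 2 + (by - ay) ** 2) ** 0.5
        return num / den if den else ((px - ax) ** 2 + (py - ay) ** 2) ** 0.5

    if len(curve) < 3:
        return list(curve)
    # Find the interior point with the largest deviation.
    idx, dmax = max(((i, perp_dist(curve[i], curve[0], curve[-1]))
                     for i in range(1, len(curve) - 1)), key=lambda t: t[1])
    if dmax <= eps:
        return [curve[0], curve[-1]]
    left = douglas_peucker(curve[:idx + 1], eps)
    right = douglas_peucker(curve[idx:], eps)
    return left[:-1] + right  # avoid duplicating the split point
```

For closed curves a starting vertex must be fixed first; the sketch above handles only the open case.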
1. Introduction

This Master's thesis has been carried out in the research group LIA (Laboratorio de Informática Aplicada) of the Computer Science Department of the University of Vigo, and comprises a further step in the ongoing work on Automatic Shape Recognition.

1.1. A detailed look at one session

One ShapeWIZ session is composed of three basic steps. In the first step the system reads the point set; in the second one, the user chooses the metrics and the algorithms for computing; and in the third one, the system shows the results. For the sake of a better understanding of how ShapeWIZ is intended to work, the author provides Fig. 10, Fig. 11 and Fig. 12, where the basic steps are illustrated.

Fig. 10. The first step (Reading the point set)

1.2. Applications

The resolution of the problems proposed can be applied in many areas, such as:

Computer vision.
Image segmentation.
Object identification in an image.
Pattern recognition.
Geographical Information Systems.
Shape identification in self-learning tools for children and disabled people.
Fig. 11. The second step (Choose proper metrics and algorithms)
Fig. 12. The third step (Show the results)

1.3. Objectives

The main goal of this work is to describe an approach to automatic shape recognition that, by using the proper algorithms and their suitable parameters, can recognize the shapes of a given sample point set, where each individual shape relates to a distinguishable element. The process includes four important subtasks, namely clustering, ordering, simplification and adaptation. For these four tasks extensive literature has been reviewed, and some work has already been done in these areas: for instance, Shaprox [28], Sharec [29] and Gathan [30] are software schemes that will be studied in depth. Dealing with intersecting and overlapping shapes is an interesting but huge challenge, and we will not go that deep at this moment.
1.4. How to read this Master's Thesis

In Section 2 the reader is introduced to some basic mathematical principles and geometric structures: point representation in the plane, convex hulls, Voronoi diagrams, Delaunay triangulations, alpha shapes and other geometric structures are presented. Section 3 gives the theoretical relevance of this work: the work may be divided into smaller subproblems, each of which is discussed together with the relevant research in its area. Section 4 describes in detail the ShapeWIZ shape recognition system developed; first the author gives an overview of the main problem, then, for each step of the flowchart, the algorithm or algorithms used are described in detail. Section 5 summarizes all the work, considers several alternative approaches and gives some directions for further work in this area.
2. Mathematical Issues

In this section we give some basic notions, definitions and some properties of the mathematical entities used in two-dimensional geometric representation, following descriptions as in [31,32]. When dealing with geometric entities like polygons, we can treat them as one-dimensional or two-dimensional objects, depending on the viewpoint of the approach. For instance, as a one-dimensional object a triangle is just the closed polygon perimeter (i.e., the set of line segments that connects its vertices sequentially and in order), whereas as a two-dimensional object a triangle refers both to its perimeter and the region it bounds. In the same way, a circle as a one-dimensional object refers to the curve (the circumference), while the disk comprises both the curve and the region it bounds, so the disk is a two-dimensional object; however, when referring to a disk one often uses the word circle as well. In the sequel we describe some shapes (geometric primitives in 2D), distances, geometric data structures, and filter algorithms to extract shapes with certain properties.

2.1. Geometric Primitives in 2D

Coordinate Systems and Point Representation

2D Cartesian coordinate system We may define a 2D Cartesian coordinate system as a pair of perpendicular lines which cross at a point that we call the origin. We usually refer to the horizontal line as the x-axis (or abscissa) and to the vertical line as the y-axis (or ordinate). In a 2D Cartesian coordinate system, points are represented uniquely by a pair of values (the x-coordinate and the y-coordinate), which are the signed perpendicular distances to the coordinate axes. We show below four points P 0, P 1, P 2 and P 3 in a 2D graph representation (see Fig. 13).

Polar coordinate system A polar coordinate system is composed of a fixed point (which we call the pole) and a ray (which we call the polar axis) with a fixed direction, usually drawn horizontally pointing to the right.
In a polar coordinate system, points are likewise represented by a pair of values: the distance from the pole (the radial coordinate, usually denoted by r) and the angle from the polar axis (the angular coordinate, usually denoted by θ). Fig. 14 shows two points represented in this system. The value of θ is positive if it is measured counterclockwise and negative if it is measured clockwise.
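Polar coordinates (r, θ) relate to Cartesian ones via x = r cos θ and y = r sin θ. A minimal Python sketch of the conversion (function names are illustrative; math.atan2 avoids an explicit quadrant case analysis):

```python
import math

def polar_to_cartesian(r, theta):
    """Convert polar coordinates (theta in radians) to Cartesian ones."""
    return r * math.cos(theta), r * math.sin(theta)

def cartesian_to_polar(x, y):
    """Convert Cartesian coordinates to polar; atan2 picks the
    correct quadrant, returning theta in (-pi, pi]."""
    return math.hypot(x, y), math.atan2(y, x)
```

Round-tripping a point through both functions recovers it up to floating-point error and a multiple of 360° in the angle.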
Fig. 13. Points represented in the 2D Cartesian coordinate system
Fig. 14. Points represented in the polar coordinate system

This representation is not unique: if we add any whole number of turns (360°) to the angular coordinate, the point represented does not change, so a point may be represented by an infinite number of polar coordinate pairs.

Conversion between coordinate systems Assuming that the pole coincides with the origin, that the polar axis is the Cartesian x-axis, and that the y-axis has angular coordinate equal to 90°, by using trigonometric functions we can convert polar to Cartesian coordinates (see Eqn. (1) and Fig. 15 for their relationship):

x = r cos θ
y = r sin θ (1)

and convert Cartesian to polar coordinates (see Eqn. (2)):
r = √(x² + y²)

θ = 0 if r = 0,
θ = arcsin(y/r) if x ≥ 0, (2)
θ = −arcsin(y/r) + π if x < 0.

Fig. 15. Points represented in polar coordinates system

Linear Components Here we refer to lines, line segments and rays as linear components. They may be represented either implicitly or parametrically.

A line may be defined in many ways. Given two points, the line is the straight line, with no endpoints, which passes through those points. Another definition is the straight line which passes through a point with a given direction. In both cases it has no endpoints (see Fig. 16). It can be represented in many ways, by linear equations and linear functions.

Linear equation (3):

L = {(x, y) : ax + by = c} (3)

where a, b and c are real coefficients, with a and b not both zero.

The line defined by x-intercept a and y-intercept b has the form (see Fig. 17 and Eqn. (4)):

x/a + y/b = 1. (4)

The characteristic form is (Eqn. (5)):

y = mx + b (5)
Fig. 16. Lines representation
Fig. 17. The line with intercepts a and b
Fig. 18. The slope m = Δy/Δx
where m is the slope, b is the y-intercept and x is the independent variable of the function y. The slope is given by (see Fig. 18 and Eqn. (6)):

m = (y 2 − y 1)/(x 2 − x 1). (6)

Another representation uses one point (x 1, y 1) and the slope m (see Eqn. (7)):

y − y 1 = m(x − x 1) (7)

The two-point representation, with (x 1, y 1) and (x 2, y 2), is given by Eqn. (8):

y − y 1 = ((y 2 − y 1)/(x 2 − x 1)) (x − x 1) (8)

Finally, the parametric form is given in Eqn. (9):

x = x 0 + at
y = y 0 + bt (9)

where x and y are functions of the independent variable t; x 0 and y 0 are initial values ((x 0, y 0) is any point on the line); and a and b are related to the slope of the line. The vector (a, b) is parallel to the line.

A line segment may be defined by two points. Given two distinct points P and Q, a line segment is the straight line which connects those points, which are its endpoints, and is denoted by PQ. A ray is a half line: it has one endpoint in one direction, but no endpoint in the other. It can also be defined by two points: one is the initial point and the other is one which it passes through.

A polyline, also called a polygonal chain, polygonal curve or piecewise linear curve, is composed of a series of line segments, which we can formally describe as follows: given a point set P = {p 1, p 2, ..., p n} in the plane, whose elements we call the vertices, the polyline consists of the line segments connecting the consecutive vertices. When the initial vertex p 1 is distinct from the final vertex p n we call it an open polyline; otherwise, if p 1 = p n, the curve is called a closed polyline. We often refer to a closed polyline as a polygon.

A triangle is a polygon determined by three non-collinear points P 0, P 1 and P 2. If P 0 is the origin, we have two edge vectors: e 0 = P 1 − P 0 and e 1 = P 2 − P 0. Each point is called a vertex. The order is important in most applications (see Fig. 19). Assuming P 0 = (x 0, y 0), P 1 = (x 1, y 1) and P 2 = (x 2, y 2), one can compute the determinant δ of Eqn. (10).
        ( 1    1    1  )
δ = det ( x 0  x 1  x 2 )    (10)
        ( y 0  y 1  y 2 )
Fig. 19. Triangle ordering: (a) counterclockwise and (b) clockwise.

If δ < 0 then the orientation is clockwise. If δ > 0 then the orientation is counterclockwise. If δ = 0 the three points are collinear.

A circle is the set of points in the plane that are equidistant from a given point O. The distance r is called the radius and the point O is called the center (see Fig. 20). The diameter is twice the radius, and the perimeter is also called the circumference. The implicit equation of a circle centered at (x 0, y 0) with radius r is given in Eqn. (11):

(x − x 0)² + (y − y 0)² = r² (11)

For a circle centered at the origin, the parametric equation is (12):

x = r cos t
y = r sin t (12)

or, for the unit circle, the rational form of Eqn. (13):

x = (1 − t²)/(1 + t²)
y = 2t/(1 + t²) (13)

Fig. 20. Circles: (a) implicit form, (b) parametric form.

2.2. Distance in 2D

To solve problems in Computational Geometry, it is crucial to know some basic information, such as: given two arbitrary points, how far one point is from another; what the distance is between a given polygon vertex and an arbitrary point in the plane; whether a given point is inside or outside a circle or a triangle; whether a point is on the left or on the right of a line segment, or just lies on the line segment itself; and so on. For solving this kind of problem we use distance metrics. Some of them are listed below.

Metrics

L 1 or Manhattan distance (see Fig. 21 (a) and Eqn. (14)):

L 1(v, w) = |v x − w x| + |v y − w y| (14)

L 2 or Euclidean distance (see Fig. 21 (b) and Eqn. (15)):

L 2(v, w) = √((v x − w x)² + (v y − w y)²) (15)

L ∞ or maximum distance (see Eqn. (16)):

L ∞(v, w) = max i=1..k |v i − w i| (16)

Vertical distance (see Fig. 21 (c) and (d)).

Perhaps the metric most used is the Euclidean distance.

Point to point Given two points (x 0, y 0) and (x 1, y 1), the distance d between them is expressed by Eqn. (17):

d = √((x 1 − x 0)² + (y 1 − y 0)²) (17)
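The metrics above are straightforward to implement. A minimal Python sketch (function names are illustrative), including the point-to-segment distance obtained by projecting the point onto the segment and clamping the parameter to [0, 1]:

```python
import math

def manhattan(v, w):
    """L1 metric: sum of absolute coordinate differences."""
    return abs(v[0] - w[0]) + abs(v[1] - w[1])

def euclidean(v, w):
    """L2 (Euclidean) metric."""
    return math.hypot(v[0] - w[0], v[1] - w[1])

def maximum(v, w):
    """L-infinity metric: largest absolute coordinate difference."""
    return max(abs(v[0] - w[0]), abs(v[1] - w[1]))

def point_to_segment(p, a, b):
    """Euclidean distance from point p to the segment ab:
    project p onto the line through a and b, clamp to [0, 1]."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:          # degenerate segment
        return euclidean(p, a)
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return euclidean(p, (ax + t * dx, ay + t * dy))
```

The same projection-and-clamp idea extends to polylines and polygon boundaries by taking the minimum over their segments.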
Fig. 21. Examples of distances: (a) Euclidean distance, (b) Manhattan distance, (c) vertical distance to a line segment, (d) vertical distance to a line support.

Very often it is more suitable to use the squared distance instead of the distance, Eqn. (18):

d² = (x 1 − x 0)² + (y 1 − y 0)² (18)

With these metrics it is easy to calculate the following distances:

Point to Line
Point to Ray
Point to Line Segment
Point to PolyLine
Point to Polygon
Point to Triangle
Point to Rectangle
Point to Convex Polygon

2.3. Geometric Data Structures

Besides the geometric primitives, other geometric data structures are used when developing algorithms for solving geometric problems.
Convex Hull Given a point set S, we may define the convex hull CH(S) as the smallest convex set that includes all the points of S [33]. We can also say that S is a convex set whenever, for any two points P and Q in S, the whole line segment PQ is also in S (see Fig. 22 and Fig. 23).

Fig. 22. Convexity and non-convexity: (a) convexity, (b) non-convexity.
Fig. 23. The convex hull of the point set.

Several algorithms have been developed to compute this structure; Table 4 shows a list ordered by date:
Algorithm | Speed | Presented by | Date
Gift Wrapping | O(nh) | Chand & Kapur [34] | 1970
Graham Scan | O(n log n) | Graham [35] | 1972
Jarvis March | O(nh) | Jarvis [36] | 1972
QuickHull | O(nh) | Eddy [37] | 1977
 | O(nh) | Bikat [38] | 1978
Divide & Conquer | O(n log n) | Preparata & Hong [39] | 1977
Monotone Chain | O(n log n) | Andrew [40] | 1979
Incremental | O(n log n) | Kallay [41] | 1984

Table 4. Convex hull algorithms, where n is the number of points and h is the number of hull vertices.

Voronoi diagram Consider the Euclidean distance between two points p and q:

dist(p, q) = √((p x − q x)² + (p y − q y)²) (19)

The Voronoi diagram is a geometric structure, presented in 1907 by Georgy Voronoi [42], and we can define it as follows. Let P = {p 1, p 2, ..., p n} be a set of n distinct points in the plane; these points are the sites of the diagram. We define the Voronoi diagram of P as a subdivision of the plane into cells, one for each site in P, with the property that a point q lies in the cell corresponding to the site p i iff dist(q, p i) < dist(q, p j) for each p j ∈ P with j ≠ i. We denote the Voronoi diagram of P by Vor(P).

Studying a single Voronoi cell: for two points p and q in the plane we define the bisector of p and q as the perpendicular bisector of the line segment pq. This bisector splits the plane into two half-planes. We denote the open half-plane that contains p by h(p, q) and the open half-plane that contains q by h(q, p). We may also notice that a point r ∈ h(p, q) iff dist(r, p) < dist(r, q). From this we can state the following:

V(p i) = ∩ 1≤j≤n, j≠i h(p i, p j). (20)

Thus, V(p i) is the intersection of n − 1 half-planes, and it is an open convex polygonal region (possibly unbounded) with at most n − 1 vertices and at most n − 1 edges. If all the points are collinear then Vor(P) consists of n − 1 parallel lines; otherwise Vor(P) is connected and its edges are either line segments or half-lines (rays).
We may state these properties: a point q is a vertex of Vor(P) iff its largest empty circle C P(q) contains three or more sites on its boundary; the bisector between sites p i and p j defines an edge of Vor(P) iff there is a point q on the bisector such that C P(q) contains both p i and p j on its boundary but no other site.
Fig. 24. Voronoi circumcircles: (a) point q and the empty circle, (b) point q on the Voronoi edge.

Fig. 24 illustrates this idea. During the last forty years, some researchers have developed algorithms for computing the Voronoi diagram. Below we refer to two of them:

Bowyer-Watson algorithm [43,44]: a method, in computational geometry, for computing the Voronoi diagram of a finite set of points in any number of dimensions. The algorithm is incremental, working by adding points one at a time to a correct Voronoi diagram of a subset of the desired points. It is sometimes known as the Bowyer algorithm or the Watson algorithm.

Fortune's algorithm [45]: the most widely used. It is a plane sweep algorithm for generating a Voronoi diagram from a set of points in the plane, using O(n log n) time and O(n) space. There are some implementations on the Internet that one can visit and test (see [46,47]).

Based on this structure it is possible to construct many types of Voronoi diagrams. In the following we refer to some of them:

Closest point Voronoi diagram The closest point Voronoi diagram of a given point set S, denoted by VD c(S), is defined as the set of all regions that cover the plane. It describes the areas that are nearest to a set of arbitrarily given points (see Fig. 25), such that (21):

V p = {x | d(p, x) ≤ d(q, x)}, p ≠ q, and p, q ∈ S (21)
Fig. 25. Closest point Voronoi diagram

Closest point Voronoi neighbor We can say that two points p and q of S are closest point Voronoi neighbors if the regions V p and V q share a common point.

Furthest point Voronoi diagram Similarly, though it is the opposite of the previous diagram, the furthest point Voronoi diagram of a given point set S, denoted by VD f(S), is defined as the set of all regions that cover the plane. It identifies the areas which have the greatest distance to the given points (see Fig. 26), such that Eqn. (22):

Fig. 26. Furthest point Voronoi diagram. The points are shown in red and the related furthest areas in black.

W p = {x | d(p, x) ≥ d(q, x)}, p ≠ q, and p, q ∈ S (22)

Furthest point Voronoi neighbor We can say that two points p and q of S are furthest point Voronoi neighbors if the regions W p and W q share a common point.
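The defining property of the closest point Voronoi diagram, that a point lies in the cell of its nearest site, can be checked directly by brute force. A minimal sketch (the function name is illustrative; actual diagrams would be built with one of the algorithms above):

```python
def voronoi_cell_of(q, sites):
    """Return the index i of the site whose Voronoi cell contains q,
    i.e. the nearest site under squared Euclidean distance.
    Brute force, O(n) per query."""
    return min(range(len(sites)),
               key=lambda i: (q[0] - sites[i][0]) ** 2
                             + (q[1] - sites[i][1]) ** 2)
```

Points equidistant from two or more sites lie on cell boundaries; there min simply returns the lowest index.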
Delaunay Triangulation A Delaunay triangulation for a set P of points in the plane is a triangulation DT(P) such that no point in P is inside the circumcircle of any triangle in DT(P). A Delaunay triangulation maximizes the minimum angle over all the triangles in the triangulation; such triangulations tend to avoid skinny triangles (with sharp angles). This triangulation was presented in 1934 by Delaunay [48]. By definition, the circumcircle of a triangle formed by three points of the original set is empty: it contains no vertices other than the three that define it (no other points are allowed inside; points are permitted only on the perimeter itself). The original definition for two-dimensional space is that a triangle net is a Delaunay triangulation iff all the circumcircles of all the triangles in the net are empty. For a set of points on the same line (collinear points) there is no Delaunay triangulation; in fact, in that case the very notion of triangulation is undefined. For four points on the same circle (e.g. the four vertices of a rectangle) the Delaunay triangulation is not unique: there are two possible triangulations of the quadrangle that satisfy the Delaunay condition. Fig. 27 shows a Delaunay triangulation and its circumcircles.

Fig. 27. The Delaunay triangulation in the plane, with the circumcircles shown.

This structure has a very close relationship with the Voronoi diagram: in fact, it corresponds to the dual graph of the Voronoi diagram. This is illustrated in Fig. 28 and Fig. 29. In 2D, as we saw for Voronoi diagrams, we can also construct two types of Delaunay triangulations (see Fig. 30): the closest point Delaunay triangulation and the furthest point Delaunay triangulation.
Fig. 28. Delaunay circumcircle centers

Fig. 29. A Voronoi diagram and its dual Delaunay Triangulation

(a) Closest Point Delaunay Triangulation (b) Furthest Point Delaunay Triangulation

Fig. 30. Delaunay Triangulations.

We say that the closest point Delaunay Triangulation of S, DT_c(S), is the dual straight-line graph in which the points p and q are connected iff they are closest Voronoi neighbors. Likewise, the furthest point Delaunay Triangulation of S, DT_f(S), is the dual straight-line graph in which the points p and q are connected iff they are furthest Voronoi neighbors.

Gabriel Graph

In the Euclidean plane, the Gabriel Graph of a set S of n points expresses some notion of proximity between those points (see Fig. 31). Formally, given a set of points, two points a and b are Gabriel neighbors, that is, adjacent vertices, iff they are distinct and the closed disk of
which the line segment ab is the diameter contains no other point of S (see Fig. 32 and Fig. 33).

Fig. 31. Gabriel graph of 100 random points

Fig. 32. Points A and B are Gabriel neighbors because C is outside their diameter circle

Fig. 33. The presence of point C within the circle prevents points A and B from being Gabriel neighbors.

This graph is named after K. R. Gabriel [49], who introduced it in a paper with R. R. Sokal in 1969. The structure has some interesting properties: it contains the Euclidean Minimum Spanning Tree as a subgraph; it contains the Nearest Neighbor Graph (see Section 2.3); and it is an instance of the β-Skeleton (see Section 2.4). If the Delaunay Triangulation is given, there is a linear-time algorithm for generating the Gabriel Graph [50].
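The diameter-disk test above translates almost literally into code. The following is a minimal O(n^3) sketch of our own (the linear-time algorithm of [50] works on the Delaunay Triangulation instead):

```python
def gabriel_edges(pts):
    """Two points are Gabriel neighbors iff the closed disk having
    segment ab as its diameter contains no other point of the set."""
    edges = []
    n = len(pts)
    for i in range(n):
        for j in range(i + 1, n):
            (ax, ay), (bx, by) = pts[i], pts[j]
            cx, cy = (ax + bx) / 2, (ay + by) / 2        # disk centre
            r2 = ((ax - bx) ** 2 + (ay - by) ** 2) / 4   # squared radius
            if all((px - cx) ** 2 + (py - cy) ** 2 > r2
                   for k, (px, py) in enumerate(pts) if k not in (i, j)):
                edges.append((i, j))
    return edges

pts = [(0, 0), (4, 0), (2, 1)]
print(gabriel_edges(pts))
# [(0, 2), (1, 2)]: the middle point lies inside the diameter disk of (0, 1)
```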
Neighborhood Graphs

In Computational Geometry, neighborhood graphs are graphs that give us some notion of the proximity of the n points (each one to the others) of a given set S of points in the plane, according to a metric. In the sequel we'll refer to some variants: the Nearest Neighborhood Graph, the K-Nearest Neighborhood Graph and the Relative Neighborhood Graph. Any of the metrics previously described can be used to compute the distances (in the examples we use the Euclidean distance).

Nearest Neighborhood Graph. We can define this structure as follows: given a point set P in the plane, the points p and q are connected, and q is a nearest neighbor of p, iff there is no other point r which is closer to p. Formally, q is a nearest neighbor of p iff \|p - q\| \leq \min_{r \neq p} \|p - r\| [51,52]. It can be a directed or an undirected graph.

K-Nearest Neighborhood Graph. We can define this structure as follows: given a point set P of n points in the plane (or, more generally, in a metric space), the graph is composed of the vertices of the point set, connected to each other iff they are neighbors. Two points p and q are k-neighbors, and connected by an edge, iff the distance from p to q is among the k smallest distances from p to the other points of the point set P. The previous graph is a special case of the K-NN graph, which we may refer to as the 1-NNG. Another special case is the (n-1)-NNG, also called the FNG (Farthest Neighborhood Graph).

Relative Neighborhood Graph. This structure is an undirected graph defined in the Euclidean plane. We obtain it by connecting two points p and q by an edge whenever there does not exist a third point r that is closer to both p and q than they are to each other. This graph was first proposed in 1980 by Godfried Toussaint [53] as a way of defining, from a set of points, a structure that would match human perception of the shape of that set (see Fig. 34).
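The relative-neighborhood rule just stated can be sketched as a direct brute-force test (our own illustration; the efficient algorithms are discussed next):

```python
from math import dist

def rng_edges(pts):
    """Relative Neighborhood Graph: connect p and q unless some third
    point r is closer to both of them than they are to each other."""
    n = len(pts)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            dij = dist(pts[i], pts[j])
            if not any(max(dist(pts[k], pts[i]), dist(pts[k], pts[j])) < dij
                       for k in range(n) if k not in (i, j)):
                edges.append((i, j))
    return edges

pts = [(0, 0), (2, 0), (1, 0.2)]
print(rng_edges(pts))
# [(0, 2), (1, 2)]: the middle point blocks the long edge (0, 1)
```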
There are several algorithms for computing and generating this graph: Supowit [54] showed in 1983 how to generate the graph efficiently in O(n log n) time; Katajainen et al. [55] created an algorithm that builds the graph in O(n) time for a random set of points distributed uniformly in the unit square; and, if the Delaunay Triangulation of the point set is given, the graph can be computed in linear time [56,57]. Properties:
Fig. 34. Relative Neighborhood graph

it is an example of a lens-based β-Skeleton, it is a subgraph of the Delaunay Triangulation, and the Euclidean Minimum Spanning Tree is a subgraph of it.

2.4. Filter algorithms to extract shapes

What is Geometric Clustering? In Computational Geometry, we can define Geometric Clustering as the computational process, based on algorithms, that separates and groups the points of a given point set in the plane according to a previously defined criterion (or condition), e.g. the geometric distance between points, in order to construct its shape or the shapes of its components (if there is more than one). Unless the sampling is sufficiently dense and uniform, a criterion is mandatory: without one, no curve can be reconstructed; e.g., given a point set in the plane, if no condition is applied [58], the output curve could be any curve, connecting whatever points in arbitrary ways.

What is Curve Reconstruction? In the same way, Curve Reconstruction is the process that connects the points sequentially, one to another, by straight line segments, without intersections, respecting one or more conditions, and provides humans with an intuitive notion of the shape. In the following we describe some filter algorithms, based on Delaunay Triangulations, that reconstruct shapes by eliminating edges under some criterion: Alpha-Shapes, β-skeleton, Crust, Gathan old and Gathan Guaranteed.
Alpha Shape

In 1981, an efficient algorithm for curve reconstruction was presented to the scientific community [59]. The algorithm is closely related to Delaunay triangulations. It is able to separate the given point set into subsets and to reconstruct the curves of its components under some conditions. This article brought up the notion of α-shapes, which, according to the authors' definition, are a family of planar graphs, not necessarily connected, because they may represent a forest instead of a single tree (if there is more than one connected component when the process ends). We can define the structure as follows: given a point set in the plane and an arbitrary real number α, the graph generated by the algorithm is the one whose vertices are α-extreme and whose edges connect each vertex to its respective α-neighbors. To better understand this process, some additional notions are needed.

Generalized disk: if α > 0, we consider a generalized disk to be a disk (circle) of radius 1/α; if α < 0, we consider the complement of a disk of radius 1/α.

α-extreme: if α < 0, a point p is said to be α-extreme if it lies on the boundary of a generalized disk that contains no other points of the point set; if α > 0, a point p is said to be α-extreme if it lies on the boundary of a generalized disk that contains all the other points of the point set.

α-neighbor: for α < 0, two points p and q are called α-neighbors if they lie on the boundary of the same generalized disk that contains no other points of the point set.

Once the value of the parameter α is fixed (e.g. α = 0, α < 0 or α > 0), the graph is generated accordingly. Below are shown the graphs generated with negative and positive α values (see Fig. 35). In two-dimensional space the algorithm runs in O(n log n) time. Some implementations are available for non-commercial use.
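As a brute-force illustration of the α-neighbor test (our own sketch, for the positive-α reading above where the generalized disk has radius 1/α; the published algorithm instead filters the Delaunay Triangulation):

```python
from math import dist, hypot

def alpha_neighbors(pts, alpha):
    """Sketch for alpha > 0: p and q are alpha-neighbors if some disk of
    radius 1/alpha passes through both and contains no other point."""
    R = 1.0 / alpha
    edges = []
    n = len(pts)
    for i in range(n):
        for j in range(i + 1, n):
            d = dist(pts[i], pts[j])
            if d == 0 or d > 2 * R:
                continue                       # no disk of radius R fits
            mx = (pts[i][0] + pts[j][0]) / 2
            my = (pts[i][1] + pts[j][1]) / 2
            h = (R * R - (d / 2) ** 2) ** 0.5  # centre offset along the normal
            ux = (pts[j][1] - pts[i][1]) / d   # unit normal to pq
            uy = (pts[i][0] - pts[j][0]) / d
            # the two candidate disk centres whose boundary passes through p and q
            for cx, cy in ((mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)):
                if all(hypot(px - cx, py - cy) >= R - 1e-9
                       for k, (px, py) in enumerate(pts) if k not in (i, j)):
                    edges.append((i, j))
                    break
    return edges

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(alpha_neighbors(square, 1.0 / 0.6))
# [(0, 1), (0, 3), (1, 2), (2, 3)]: the four sides, but not the diagonals
```

With a disk radius of 0.6, the unit square's sides (length 1) each admit an empty disk, while the diagonals (length √2 > 1.2) do not, so the α-shape is the square's boundary.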
There is a Java applet [60] for interactively testing different values of α; it shows the behaviour of the algorithm as points are added on the screen and different views are selected. This algorithm works in O(n log n) time. Over the last almost thirty years, this initial discovery has inspired several researchers, and many works and articles have been proposed, among others the extension to higher dimensions (see [61]).

β-skeleton

The β-skeleton is a structure introduced in 1985 by [62] as a variant of the previous α-shapes [59]. The algorithm generates an undirected graph, which we may define as follows: given the point set S in the Euclidean plane, two points p ∈ S and q ∈ S are connected by an edge iff the angle prq, with
(a) α-neighbors connected, with α < 0 (b) α-neighbors connected, with α > 0

Fig. 35. Alpha-Shapes with different α values.

r ∈ S, is sharper than the angle Θ controlled by the numerical parameter β (considered as a threshold). For a better understanding of the algorithm, let us consider the following:

β value: a positive real value, defined as the threshold, which controls the angle Θ.

The angle Θ: we can calculate this angle using formula (23):

\theta = \begin{cases} \sin^{-1}(1/\beta), & \text{if } \beta \geq 1, \\ \pi - \sin^{-1}(\beta), & \text{if } \beta \leq 1. \end{cases} \qquad (23)

Empty or forbidden region: given the point set S and p, q ∈ S, the region R_pq, called the empty or forbidden region, is the set of points r for which the angle prq is greater than Θ. For β > 1, R_pq is the union of the two open disks of diameter βd(p, q), with Θ < π/2 (see Fig. 36 (a)). For β < 1, R_pq is the intersection of the two open disks of diameter d(p, q)/β, with Θ > π/2 (see Fig. 36 (b)). When β = 1, R_pq is the single open disk with pq as its diameter; the two formulas give the same value Θ = π/2 (see Fig. 36 (c)).

We can also say that the β-skeleton of a point set in the plane is the undirected graph that connects two vertices p and q by an edge pq iff R_pq contains no other points of S. If there exists a point r for which the angle prq is greater than Θ, then pq is not an edge of the graph. The radius of the disks is related to the β value. For the optimal reconstruction of an r-sampled smooth curve, [58] establishes the value r ≤ 0.279. Compared with the crust algorithm (described next), this algorithm requires a lower minimum sample density for reconstructing a curve. The algorithm runs in O(n log n) time.
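The angle-based definition translates into a short brute-force sketch (our own; β = 1 should reproduce the Gabriel Graph, since Θ = π/2 then matches the diameter-disk test):

```python
from math import dist, asin, acos, pi

def theta(beta):
    """Threshold angle from formula (23); beta = 1 gives pi/2."""
    return asin(1 / beta) if beta >= 1 else pi - asin(beta)

def angle(p, r, q):
    """Angle prq at vertex r, via the law of cosines."""
    a, b, c = dist(r, p), dist(r, q), dist(p, q)
    cos_val = (a * a + b * b - c * c) / (2 * a * b)
    return acos(max(-1.0, min(1.0, cos_val)))

def beta_skeleton_edges(pts, beta):
    """Keep edge pq iff no third point r sees it under an angle > theta(beta)."""
    th = theta(beta)
    n = len(pts)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if not any(angle(pts[i], pts[k], pts[j]) > th
                       for k in range(n) if k not in (i, j))]

pts = [(0, 0), (4, 0), (2, 1)]
print(beta_skeleton_edges(pts, 1.0))
# [(0, 2), (1, 2)]: with beta = 1 this is exactly the Gabriel Graph of pts
```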
Fig. 36. The empty regions: (a) the union of two disks, (b) the intersection of two disks, and (c) the single disk
Crust

In Curve Reconstruction, the previous algorithms required sufficiently dense and uniform sampling conditions on the point set. In contrast, Crust is an algorithm presented by [58] which, under appropriate sampling conditions, guarantees the reconstruction of a smooth curve represented by a point set in the plane, without intersections (see Fig. 37). The sample density may vary over the curve, increasing in areas of high curvature and decreasing in flat areas. We can describe the algorithm in the following way: let S be a set of n points in the plane and V the set of vertices of the Voronoi diagram Vor(S) of S. Consider S' = S ∪ V and its closest point Delaunay Triangulation DT(S'). An edge e ∈ DT(S') belongs to Crust(S) iff both its endpoints belong to S or, put another way, iff there exists an empty circle that touches only the endpoints (with no other point inside the circle). This notion is related to the Medial Axis. Some of the notions used are new and must be presented in order to understand the algorithm.

Sample density condition: we say that a smooth curve F is r-sampled by a set of points S if every point p ∈ F has a sample s ∈ S within distance r times the distance from p to the medial axis. The authors consider values of r ≤ 1.

Medial Axis: in 2D, the Medial Axis of a curve F is the closure of the set of points y ∈ R² which have two or more closest points on that curve.

Local Feature Size: the Local Feature Size is a function which, in some sense, defines the level of detail at an arbitrary point of a smooth curve F. Formally, it is the Euclidean distance from a point p ∈ F to the closest point m on the medial axis.

This algorithm works in O(n log n) time. Compared with previous algorithms, the crust proposal brought new solutions in Curve Reconstruction to Computational Geometry; however, it has two major drawbacks. One is the difficulty of dealing with non-smooth curves, that is, curves where sharp corners are present.
The other problem is that it is unable to cope with open curves and/or collections of open curves (see Fig. 38).

Gathan old

The last algorithm we approach is Gathan. This algorithm, developed by Tamal Dey and Rafael Wenger, has two versions: one presented in 2001 [63], which we call Gathan old, and the other presented in 2002 [64], called Gathan Guaranteed, to which we refer next. Previous algorithms like [58] and [65] have some drawbacks (see Fig. 38).
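The one-step crust construction of [58] described above can be sketched in a deliberately brute-force way (our own illustration; function names and the quadratic/cubic loops are not the published O(n log n) implementation): Delaunay triangles are found by the empty-circumcircle test, their circumcentres play the role of Voronoi vertices, and an edge between samples survives iff some circle through its endpoints is empty in the augmented set.

```python
from math import dist, cos, sin, pi

def circumcentre(a, b, c):
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None                        # (nearly) collinear triple
    ux = ((ax*ax + ay*ay)*(by - cy) + (bx*bx + by*by)*(cy - ay)
          + (cx*cx + cy*cy)*(ay - by)) / d
    uy = ((ax*ax + ay*ay)*(cx - bx) + (bx*bx + by*by)*(ax - cx)
          + (cx*cx + cy*cy)*(bx - ax)) / d
    return (ux, uy)

def voronoi_vertices(pts):
    """Circumcentres of the empty-circumcircle (Delaunay) triples."""
    verts, n = [], len(pts)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                c = circumcentre(pts[i], pts[j], pts[k])
                if c is None:
                    continue
                r = dist(c, pts[i])
                if all(dist(c, pts[m]) >= r - 1e-9
                       for m in range(n) if m not in (i, j, k)):
                    verts.append(c)
    return verts

def is_delaunay_edge(p, q, others):
    """True iff some circle through p and q is empty of `others`.
    Centres lie on the bisector m + t*u; each other point forbids a
    half-line of t values, so we intersect the remaining intervals."""
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    d = dist(p, q)
    ux, uy = (q[1] - p[1]) / d, (p[0] - q[0]) / d
    lo, hi = -1e18, 1e18
    for r in others:
        A = (mx - r[0]) ** 2 + (my - r[1]) ** 2 - (d / 2) ** 2
        B = 2 * ((mx - r[0]) * ux + (my - r[1]) * uy)
        # r lies outside the circle centred at m + t*u iff A + B*t > 0
        if abs(B) < 1e-12:
            if A <= 1e-9:
                return False
        elif B > 0:
            lo = max(lo, -A / B)
        else:
            hi = min(hi, -A / B)
    return lo <= hi + 1e-9

def crust(pts):
    """Edges of DT(S ∪ V) whose two endpoints are both original samples."""
    aug = list(pts) + voronoi_vertices(pts)
    n = len(pts)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if is_delaunay_edge(pts[i], pts[j],
                                [p for m, p in enumerate(aug) if m not in (i, j)])]

samples = [(cos(2 * pi * k / 12), sin(2 * pi * k / 12)) for k in range(12)]
edges = crust(samples)
print(len(edges))   # 12: the crust of a densely sampled circle is the 12-gon
```

On a densely sampled closed smooth curve, every sample receives exactly its two neighbours along the curve, which is the reconstruction guarantee the crust theorem provides.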
(a) The Voronoi diagram of the point set (b) Curve reconstruction from the Delaunay triangulation

Fig. 37. The Crust Algorithm.

(a) Crust (b) Gathan

Fig. 38. Comparing algorithms (Crust and Gathan)

As we can see, Crust has problems handling open curves and collections of curves, while Gathan has none. This proposal works under certain conditions, which we refer to next, and is intended to yield good results. It seems to solve some of the problems left open by the previously mentioned algorithms. It improves their performance, being more precise in Geometric Clustering and Curve Reconstruction: it follows some of the strategy of [65] in its process, while adding new notions and conditions that solve the sharp-corners problem. To explain how the algorithm works, we need some extra definitions. It uses the concepts of pole and pole direction, recalled from [66].

Pole: the pole of a sample point p is the farthest Voronoi vertex from p in its Voronoi cell. If the Voronoi cell is unbounded, the pole is one of the points of the Voronoi cell at infinity.

Pole direction (or estimated normal line):
It is the direction from p to the pole. If the Voronoi cell is unbounded, the pole direction is the average of the two unbounded rays.

Angle Condition: an edge is chosen for the NN test if its dual Voronoi edge makes an angle of less than a user-defined parameter α with the estimated normal at p. It was observed that a value between 35 and 40 degrees is a good choice in most cases of curve reconstruction.

Ratio Condition: this condition compares the length of a Delaunay edge (p1, p2) to the distance from p1 (or p2) to the endpoints of the Voronoi edge. It was verified that a value in the range 1.7–2.0 works best in most cases. This threshold condition is used to allow the reconstruction of edges that cut across a corner.

Topological Condition: the NN algorithm chooses at most two edges per sample point. Nevertheless, in the filtering process some sample points may acquire more than two edges. Assuming that we are reconstructing a curve without branchings, we should delete the extra edges. In general, the algorithm keeps only the two smallest and deletes the others. This criterion requires that each sample point has no more than two incident edges.

We can explain how the algorithm works as follows. Basically, the algorithm has four steps. It starts by constructing the Voronoi diagram and Delaunay triangulation of the sample point set. Then it verifies whether certain edges obtained by the NN filter are part of the reconstruction. Next, it extends the reconstruction along adjacent points whose poles point in the same direction. After that, it adds the missing edges adjacent to the corners. The last step adds any edges missed by the previous three, namely edges in smooth portions of the curve where the pole direction flips from one side to the other. This algorithm works in O(n log n) time.

Gathan Guaranteed

The Gathan Guaranteed algorithm, presented in [64], is an improvement of the previously described one.
In that paper, the authors kept more or less the same philosophy, but made some refinements in the four steps. With those changes, they guarantee the reconstruction of a family of closed curves under appropriate sampling conditions. However, they do not guarantee a correct reconstruction for open curves. So far, this algorithm is the best for curve reconstruction from a point set in the plane.
3. The ShapeWIZ approach

3.1. Principal scheme

Referring to the introduction, and in order to achieve the objectives, we consider the following approach (see Fig. 39). The most important steps are #2: Classify the point set (Clustering), #3: Order the points of each point set (Ordering), #4: Simplify each set (Simplification or Approximation), and #6: Adapt (Adaptation), which we describe in the sequel. This work, as previously mentioned, is an extension of the works of David [28] and Pedro [29], trying to solve problems that remained open. Our main goal is to build on their works, implementing the additional features needed to reach our initial proposal: automatic shape recognition. For now, we focus only on recognizing simple shapes, that is, shapes with no intersections. The main idea is as follows: given a read point set, the scheme should recognize whether the point set is a line segment, a circle, or a polygon with 3 up to 7 vertices, and whether or not it is a regular polygon. At this moment, and because time was short, we did not intend to develop a single monolithic scheme. The scheme is therefore composed of parts, each one interacting with the others in order to achieve the final goal. In each stage of the global process there is an input and an output text file. In the first stage (Reading), the input can also come from left-button mouse clicks. In this stage we use the GathanViewerExtended scheme for reading the data, either by loading an input point-list text file or by clicking the left mouse button (adding the points one by one). In the next stage (Clustering), we must choose the algorithm from an option list. We can choose among five algorithms: Crust, Nearest Neighbor, Conservative Crust, Gathan and Gathan Guaranteed. We can modify the parameters if we wish.
Next, since we have the data stored (Voronoi points and edge list), we apply a connected-components process to discover whether there is more than one shape in the point set. After the process ends, we save the data into an edge-list and a point-list text file for later usage. The next phase is Simplification. Once we have all the data needed, we close GathanViewerExtended and open Sharec Extended. After running the application, we load the previously saved files. The simplification process is applied to each one of the loaded components. After that, each component (the related shape) is submitted to a recognition process (deciding whether the shape is a line segment, a circle or a polygon of three up to seven vertices). Then the shape is adapted (Adapting) using the Adapter (an implementation of the Garcia Palomares algorithm). The function value returned is stored, and the process goes back to the recognition phase and tries another adaptation. As soon as the whole process ends, the minimum function value gives us the most probable shape (or shapes) of the initial point set. Then the scheme shows the user the results found.
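The connected-components step over the stored edge list can be sketched with a small union-find (names and layout are our own illustration, not the actual GathanViewerExtended code):

```python
def connected_components(n_points, edges):
    """Union-find sketch: split the clustered edge list into separate
    shapes before the simplification stage."""
    parent = list(range(n_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)           # union the two components
    comps = {}
    for p in range(n_points):
        comps.setdefault(find(p), []).append(p)
    return sorted(comps.values())

# two shapes in one point set: a triangle (0, 1, 2) and a segment (3, 4)
edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
print(connected_components(5, edges))   # [[0, 1, 2], [3, 4]]
```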
Fig. 39. ShapeWIZ Flowchart

3.2. Clustering

The filter algorithms seen in Section 2.4 already perform some kind of clustering, because their sampling conditions allow the point set to be separated into components. In general, Clustering is considered an important unsupervised learning problem. Its goal is to find structure in a collection of unlabelled data. We can also say that Clustering is the process of organizing objects into groups (called clusters) whose members are similar to each other in some way (respecting some conditions). To fix the idea, see Fig. 40 below.
Fig. 40. Distance-based clustering

As we can see, the four clusters (right picture) are divided by a similarity criterion (the distance between them). We may say that two or more objects belong to the same cluster if they are close enough according to the defined criterion; in this case the criterion is the geometric distance. Several criteria can be used for grouping clusters. In this particular case we use a well-defined and enumerable criterion, but it could happen that we have to define a more general concept, organized by the characteristics that must be present in the cluster's members in order for them to cluster. Clustering types may be classified as follows:

Exclusive Clustering
Overlapping Clustering
Hierarchical Clustering
Probabilistic Clustering

In the first case, data are grouped in a single way; e.g., if a member belongs to a cluster, it cannot be included in another. On the contrary, the second case uses fuzzy sets to cluster data: each point may belong to two clusters at the same time, though with different degrees of membership. The third case uses algorithms based on the following technique: at first, each datum (or point) is itself a cluster, so in the beginning we have as many clusters as points (data instances); next, after some iterations of the algorithm, the points are grouped, respecting the previously defined criteria; finally, the process ends and the final number of clusters is reached. The last case is characterized by the use of a probabilistic approach. An important component of Clustering Algorithms is the distance: its concept and definition. Usually the L1, L2 and Minkowski metrics are used. Metrics for Clustering:
The Minkowski metric for high-dimensional data:

d_p(x_i, x_j) = \left( \sum_{k=1}^{d} |x_{i,k} - x_{j,k}|^p \right)^{1/p} \qquad (24)

The L1 and L2 metrics are the special cases where p = 1 and p = 2, respectively. Often it is difficult to compare data, because the features of its components are not continuous and we cannot use a metric like the one above, as with nominal categories such as the months of the year, the days of the week or colors; but that is not the case we approach next, where we have distances as metrics.

3.3. Ordering

What is Ordering? In Computational Geometry, Ordering is the process of finding the shortest path in a connected given point set, according to some ordering criteria. It means that all the vertices are connected by line segments, starting from the first, going to the second, and so on, until the last one [33]. The filter algorithms seen in Section 2.4 already perform some kind of ordering, because their sampling conditions, besides allowing the point set to be separated into components, also provide their ordering. In the case of open curves, the first and the last points must be distinct vertices; in closed curves, on the other hand, the first and the last vertices must coincide (see Fig. 41). Ordering (or Sorting) is also a main problem in Graph Theory, so we should add some considerations on it too. A graph G = (V, E) is a structure composed of a set of vertices V and a set of edges E, which connect some (or all) pairs of vertices of G. A tree is a structure in which any two vertices are connected by a single path, that is, cycles are not allowed. A Spanning Tree is a tree composed of all the vertices and some (in some cases, all) of the edges of G (see Fig. 42); it is not unique, because a graph can have more than one spanning tree.

Minimum spanning tree

We can assign a weight to each edge.
In Computational Geometry, this weight is usually the geometric distance between the points, and we assign a weight to a spanning tree by computing the sum of the weights of its edges. So, a Minimum Spanning Tree (MST) is a spanning tree whose weight is less than or equal to the weight of every other spanning tree (see Fig. 43). These structures have important properties, which we can use to solve some hard problems. They can be computed quickly and easily, creating a
Fig. 41. The red line is the shortest path between P_1 and P_n, with P_1 = P_n.

Fig. 42. A Spanning Tree
Fig. 43. The minimum spanning tree

sparse graph which reflects many of the features of the original. By deleting the longer edges, they make it possible to identify groups in the point set. Because of that, they are often used to approximate solutions to classical hard problems such as the TSP or the Steiner tree. There are three classical algorithms for creating MSTs (see Table 5).

Table 5. MST Algorithms

Algorithm       Speed         Date
Boruvka [67]    O(N log N)    1926
Kruskal [69]    O(N log N)    1956
Prim [68]       O(N log N)    1957

Steiner tree

A Steiner tree T is a subgraph of G (T ⊆ G) represented by a minimum-cost tree which connects all the vertices V of G (see Fig. 44). This structure differs from the previous MST in that it allows new points (vertices) to be created in the graph in order to reduce the cost of the Steiner tree. There are two types of Steiner tree: the Euclidean Steiner tree and the rectilinear Steiner tree. The first uses the L2 metric to calculate distances between vertices; each newly created vertex has degree 3, and the angles formed by its edges measure 120°. The second uses the L1 metric; each newly created vertex has degree four
Fig. 44. The point set and its respective Steiner tree

and the angles formed by its edges are multiples of 90°. In general, the Steiner tree problem is NP-hard and there is no algorithm capable of solving it in polynomial time; however, there are some heuristic approaches. Arora [70] developed a polynomial-time approximation scheme. There is a 1.55-approximation algorithm due to Robins and Zelikovsky [71], and another approach in [72].

Traveling Salesman Problem

In general, given a weighted graph, the TSP consists in finding the minimum-cost cycle that visits all its vertices exactly once. In computational geometry we can visualize this (see Fig. 45): connecting all the points of a point set, we reconstruct the curve and figure out the shape it represents. Most TSP problems are NP-complete. The usual procedure is to apply heuristic approaches, which yield near-optimal solutions. In curve reconstruction, Giesen [73] showed that for sufficiently dense sampling points in the plane, the TSP tour is a correct reconstruction of a piecewise-smooth simple closed curve. In general, the TSP tour is difficult to compute, but Althaus and Mehlhorn [74] showed how to compute it in polynomial time for curve reconstruction.
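As an illustration of the heuristic route, the classic greedy nearest-neighbour tour can be sketched as follows (our own example; it gives a valid cycle, not the optimal one in general):

```python
from math import dist

def nearest_neighbour_tour(pts):
    """Greedy TSP heuristic: starting from point 0, repeatedly visit the
    nearest unvisited point; the cycle closes back to the start."""
    unvisited = set(range(1, len(pts)))
    tour = [0]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda i: dist(pts[last], pts[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

pts = [(0, 0), (0, 1), (1, 1), (1, 0)]
tour = nearest_neighbour_tour(pts)
print(tour)   # visits the four corners of the unit square in order
```

On this square, every consecutive pair in the returned tour (including the closing edge) is at distance 1, so the greedy tour happens to be optimal here.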
Fig. 45. TSP representation. On the left, the point set; on the right, the cycle.

3.4. Simplifying

What is Simplification? In Computational Geometry, Simplification (or Approximation) is a problem we can define as follows: given a curve represented by n points in the plane, we want to obtain another, coarser curve, represented by m points with m < n, that gives the notion of a similar curve. We may divide the Simplification problem into Piecewise Linear Approximation, when the curve is an open one, and Polygonal Approximation, when the curve is closed. Formally, on the one hand, we define an open polygonal curve as follows: given a set P of n points in the plane, P = {p_1, ..., p_n}, with n > 2 and p_1 ≠ p_n, the points are connected sequentially from p_1 to p_n by n − 1 line segments; the simplified curve of P is a set Q of m + 1 points, Q = {q_1, ..., q_{m+1}}, with m line segments, Q ⊆ P, m < n, and with p_1 = q_1 and p_n = q_{m+1}, because they share the endpoints. On the other hand, for a closed curve, given a set P of n points in the plane, P = {p_1, ..., p_n}, with n > 2 and p_1 = p_n, the simplified curve is a set Q = {q_1, ..., q_{m+1}}, Q ⊆ P, m < n, with m line segments, and q_1 = q_{m+1} or not, because, depending on the approximation, it can happen that they do not share the previous endpoints (see Fig. 46). There is a classification of these problems according to the parameters used in the approach [75].

min-# problems: when, given a polygonal curve with n line segments and an input number m of line segments, the goal is to find the polygonal curve with m line segments that minimizes the error (or tolerance) ε.
(a) Piecewise linear approximation (b) Polygonal approximation

Fig. 46. Curve Approximation

min-ε problems: when, given a polygonal curve and a tolerance ε, we want to find another polygonal curve, represented by m (m < n) line segments, that does not exceed the tolerance ε.

Over the last forty years, many researchers have been involved in this subject: there are more than one hundred algorithms to solve simplification problems [76]. Error-approximation metrics were used in the development of those algorithms, as already described. All these distances were formally approached in the previous section, but we should redefine them according to [76]. To obtain a minimum or a maximum, any distance metric can be used. The distance d(k; i, j) from the curve point p_k to the approximating line segment (p_i, p_j) may be computed as follows (see Fig. 47):

d(k; i, j) = \frac{|y_k - a_{i,j} x_k - b_{i,j}|}{\sqrt{1 + a_{i,j}^2}} \qquad (25)

where the coefficients a_{i,j} and b_{i,j} are computed from the parameters of the line segment (p_i, p_j):

a_{i,j} = \frac{y_j - y_i}{x_j - x_i} \qquad (26)

b_{i,j} = y_i - a_{i,j} x_i. \qquad (27)

In the following, we define the accumulated L_p error for an approximating segment (p_i, p_j) as the sum of the distances d(k; i, j) over all the vertices between its endpoints:

e_p(i, j) = \sum_{k=i+1}^{j-1} d^p(k; i, j) \qquad (28)
Fig. 47. Distance d(k; i, j) from a point p_k to the line segment (p_i, p_j).

e_\infty(i, j) = \max_{i < k < j} \{ d(k; i, j) \} \qquad (29)

E_p(P) = \sum_{m=1}^{M} e_p(i_m, j_m) \qquad (30)

E_\infty(P) = \max_{1 \le m \le M} \{ e_\infty(i_m, j_m) \} \qquad (31)

The min-# and min-ε problems can be solved by dynamic programming or by graph-theory-based methods. Besides that, over the last thirty years many heuristic algorithms have been proposed. In general, optimal algorithms are very slow, between O(N²) and O(N³), and heuristic algorithms are faster but with low optimality. That said, we realize that for large input data the development of efficient algorithms is still an open problem.

Dynamic Programming Approach

To solve both min-# and min-ε problems, some researchers have used Dynamic Programming [33], though some of their works focus directly on only one of these approaches. Dynamic Programming began in 1961 with Stone's work [1] and Bellman's [2], and others followed the same path. Next we refer to some of them (see Table 6). The first optimal algorithm was Perez and Vidal's [7], who followed Bellman's idea. Their work has inspired some of the subsequent researchers, namely Kolesnikov. In fact, Perez and Vidal proposed in their algorithm a solution for min-ε with the L2 error metric in O(MN²). Kolesnikov et al. [13,14,15] introduced a modification of the state space and a scheme for storing pre-calculated approximation errors. These improvements reduced the complexity of the algorithm to O(M(N − M)²). As far as we know, this algorithm is the best for polygonal approximation.
Table 6. Dynamic Programming Algorithms

Presented by                       Date
Stone [1]                          1961
Bellman [2]                        1961
Gluss [3]                          1962
Lawson [4]                         1964
Cox [5]                            1971
Cantoni [6]                        1971
Perez and Vidal [7]                1994
Chen-Ventura-Wu [8]                1996
Heckbert and Garland et al. [9]    1997
Tsen et al. [10]                   1998
Mori et al. [11]                   1999
Salloti [12]                       2001
Kolesnikov [13,14,15]              2003

Heuristic Approach

One can count several heuristic approaches (about a dozen) and more than one hundred algorithms developed. Rosin [77] proposed two concepts for assessing polygonal approximation algorithms: fidelity and efficiency. According to his proposal, most of the algorithms he tested had low fidelity and/or low efficiency values, which leaves much work still to be done. From the optimality point of view, one can divide the heuristic approaches into two classes (see [76]): classical algorithms, like sequential algorithms, split, merge, split-and-merge, dominant-point detection and relaxation labelling; and optimization algorithms: K-means, tabu search, genetic algorithms, ant colony optimization methods and vertex-adjustment methods. Some of them are listed in Table 7 and Table 8.

Table 7. Heuristic algorithms: classical approach

Method                     Presented by                    Date
Dominant point detection   Attneave [16]                   1954
Sequential algorithm       Sklansky & Gonzalez [17]        1972
Split                      Douglas-Peucker [18]            1973
Split and merge            Pavlidis and Horovitz [19]      1974
Relaxation labelling       Davis and Rosenfeld [20]        1977
Split                      Hershberger and Snoeyink [21]   1992
Merge                      Pikaz and Dinstein [22]         1995
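The split strategy of Douglas-Peucker [18], listed in Table 7, can be sketched directly from its description: keep the endpoints, find the point of maximum deviation from the approximating segment, and recurse while the deviation exceeds a tolerance. The example data are our own illustration.

```python
from math import dist

def point_line_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((by - ay) * px - (bx - ax) * py + bx * ay - by * ax)
    return num / dist(a, b)

def douglas_peucker(pts, eps):
    """Split heuristic: recurse on the point of maximum deviation while
    it exceeds the tolerance eps (a min-eps style simplification)."""
    if len(pts) < 3:
        return list(pts)
    k, dmax = 0, 0.0
    for i in range(1, len(pts) - 1):
        d = point_line_distance(pts[i], pts[0], pts[-1])
        if d > dmax:
            k, dmax = i, d
    if dmax <= eps:
        return [pts[0], pts[-1]]           # segment is already close enough
    left = douglas_peucker(pts[:k + 1], eps)
    right = douglas_peucker(pts[k:], eps)
    return left[:-1] + right               # drop the duplicated split point

pts = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6),
       (5, 7), (6, 8.1), (7, 9), (8, 9), (9, 9)]
print(douglas_peucker(pts, 1.0))
```

The result keeps both endpoints and the sharp bend at (3, 5), discarding the points that deviate less than the tolerance from the simplified segments.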
Method                    Presented by                  Date
K-means                   Phillips and Rosenfeld [23]   1988
Vertex adjustment         Chen, Ventura and Wu [8]      1996
Tabu search               Glover and Laguna [24]        1997
Genetic algorithms        Yin [25]                      1998
Ant colony optimization   Vallone [26]                  2002
Ant colony optimization   Yin [27]                      2003

Table 8. Heuristic algorithms: optimization approach

Some heuristic algorithms can be used as part of optimal algorithms; for instance, a heuristic algorithm can be used inside a graph-theory-based algorithm to reduce the complexity of the graph construction. The fidelity of classical algorithms is usually low [77], because the approximation error is not controlled during the process. On the contrary, in optimization algorithms the global approximation error is the main criterion under control. This class of algorithms can provide near-optimal or optimal results, but global optimality cannot be guaranteed, even for iterative approaches.

3.5. Adaptation

The adaptation algorithm used here was already implemented in the two previous final projects [28] and [29]. It is an implementation based on the derivative-free optimization algorithm of [78] and is not described further.
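The algorithm of [78] itself is not reproduced here. The toy coordinate search below only illustrates the general derivative-free pattern-search idea on which such adaptation relies (an assumption-laden sketch, NOT the algorithm of García-Palomares and Rodríguez; the objective function is hypothetical):

```python
def coordinate_search(f, x0, step=1.0, tol=1e-6):
    """Minimal derivative-free coordinate (pattern) search: poll f along
    each coordinate axis and halve the step when no axis improves.  This
    only sketches the family of methods; it is NOT the algorithm of
    Garcia-Palomares and Rodriguez [78]."""
    x, fx = list(x0), f(x0)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = list(x)
                trial[i] += d
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
                    break
        if not improved:
            step *= 0.5                 # no axis improved: refine the mesh
    return x, fx

# Hypothetical adaptation objective: pull a 2-D vertex towards (3, -1).
best, val = coordinate_search(lambda v: (v[0] - 3) ** 2 + (v[1] + 1) ** 2,
                              [0.0, 0.0])
```

The appeal of this family for shape adaptation is that the objective (the fit of a polygon to the point set) need not be differentiable.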
4. Implementation details

All the structures we have seen in the previous sections are needed for implementing the algorithms. Structures like Voronoi diagrams and Delaunay triangulations are fundamental for the development of the scheme. Although the mathematical formulae behind the algorithms for their construction are correct, we must take into account the problem of numerical computation: working with floating-point numbers is unavoidable, so it is advisable to use proper geometric libraries and software packages that were developed with precision in mind. Thus we have searched for, and found, some software that could be useful.

Libraries and Software Packages

LEDA. Its name stands for Library of Efficient Data types and Algorithms. It is a proprietary-licensed software library which provides C++ implementations for Computational Geometry and Graph Theory. Its development began in 1989 with Kurt Mehlhorn and Stefan Näher at the Max Planck Institute for Informatics; since 2001 it has been developed by Algorithmic Solutions Software GmbH. LEDA is available in three editions: Free, Research and Professional. Its current version is 6.2.2. The Free edition is freeware and can be used, but its source code is only available for purchase; the other editions are available only with payment. The Free LEDA contains several numerical data types needed for robust implementations. The library is currently available for several platforms: Linux, Solaris and MS Windows. One can download Free LEDA from [79]. The other editions contain many additional features, including graph algorithms, cryptography algorithms, geometric algorithms and much more, but their usage, as said before, is only available with payment.

CGAL. Its name stands for Computational Geometry Algorithms Library. It is a library developed in the C++ language, though Python bindings are also available. Its development started in 1996, supported by a funding project of the European Union, and it is now run as the CGAL Open Source Project.
This software is available under two types of licenses: the Open Source licenses (LGPL or QPL) when used for open source software development, and a commercial license covering commercial or industrial customers, including academic research. The library covers several fields of algorithm design: a geometric kernel for basic operations in Computational Geometry; arithmetic and algebra operations; convex hull, Voronoi diagram and Delaunay triangulation algorithms; mesh generation; geometry processing, and much more. CGAL depends on the BOOST library. Its current stable version is 3.6 (March 2010). It is available for several platforms: Linux, Solaris, Mac OS and MS Windows. One can download it from [80].
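The precision issue raised at the start of this section can be made concrete with the classic orientation predicate. The sketch below is plain Python, not CGAL code; it contrasts a floating-point evaluation with an exact rational one, which is the kind of guarantee an exact geometric kernel (as in CGAL or LEDA) provides:

```python
from fractions import Fraction

def orient_float(a, b, c):
    """Sign of twice the signed triangle area, computed in plain floats.
    Near-degenerate inputs can be misclassified because of rounding."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def orient_exact(a, b, c):
    """The same predicate evaluated in exact rational arithmetic."""
    ax, ay, bx, by, cx, cy = (Fraction(v) for p in (a, b, c) for v in p)
    return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)

# The points (0,0), (1, 1/3), (5, 5/3) are exactly collinear as reals,
# but 1/3 and 5/3 are not representable as floats, so the float predicate
# reports a tiny nonzero turn while the rational one reports zero.
p, q, r = (0, 0), (1, 1 / 3), (5, 5 / 3)
float_sign = orient_float(p, q, r)      # nonzero: a rounding artefact
exact_sign = orient_exact((0, 0), (1, Fraction(1, 3)), (5, Fraction(5, 3)))
```

A single wrong sign in such a predicate is enough to make a Delaunay or Voronoi construction crash or loop, which is why the libraries above matter for this scheme.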
Boost. It is a collection of Open Source libraries developed in the C++ language, with the purpose of extending the functionality of the C++ Standard Library. It has two types of licensing, for commercial and non-commercial use. Its current version is 1.43.0 (March 2010), available for download at [81]. BOOST makes extensive use of Generic Programming (templates). It is composed of more than 80 libraries which cover linear algebra, multi-threading, image processing, regular expressions, string and text processing, containers, iterators, generic algorithms, concurrent programming, safe data structures, math and numeric operations, and miscellaneous utilities.

TRIANGLE. It is a software library developed in the C language by Jonathan Richard Shewchuk at Carnegie Mellon University as part of the Quake project. It is available for download at [82]; the current version is 1.6 (July 2005). TRIANGLE generates exact Delaunay triangulations, constrained Delaunay triangulations, conforming Delaunay triangulations, Voronoi diagrams and high-quality meshes. It is copyrighted by the author and may not be sold or included in commercial products without a license from the author. At run time, TRIANGLE's command switches select which of the structures above are generated. Input and output files are supported; they may be text-formatted or image files for the structures referred to above.

Gathan. This is a software scheme for the reconstruction of curves with corners, developed in C++; its code is available for download [30]. It uses TRIANGLE for the generation of the Voronoi diagrams and Delaunay triangulations. It is copyrighted software, which can be used for academic purposes. The scheme has a graphical interface for inputting the points, and text-formatted files can be loaded. The scheme also allows the user to select the algorithms and change their parameters for the computation.
The available algorithms are: Crust, NN-Crust, Conservative Crust, Gathan (old) and GathanG. After the reconstruction it is possible to save the results in two types of files: text-formatted files (listings of the points) and .xfig files, an image format.

Shaprox. This is a software scheme developed in September 2005 [28] in the C++ language. It has a graphical interface where points can be added by clicking the mouse. It includes a C++ implementation of the adaptation algorithm [78], used to compute a polygon that represents a given point set.
Sharec. It is an extension of the previous Shaprox scheme, created in June 2007 [29] and also developed in the C++ language. It is a simple shape recognition scheme, with a graphical interface for adding points and for choosing several options and parameters. It has both a simple simplification process and a recognizer process, and it also includes the adaptation algorithm previously used in Shaprox.
5. Conclusions and future work

5.1. Conclusions

We have reached the stage of surveying the algorithms available for the job. A full implementation is out of the scope of this Master's project: there is simply not enough time. For instance, implementing just one of the algorithms used in a single part of the scheme would require more hours than are available for this project. Therefore, this project is composed of parts already implemented by other people (Sharec and Gathan), which work in a robust way.

5.2. Future work

The objective of a future PhD thesis is to develop the entire project, although some collaborators will be needed. Our intention is to implement both the parts described in this Master's thesis and others, including parts such as intersecting and overlapping shapes, which are still a challenge. The project will follow the steps of the proposed model, which consists of several parts that can be developed independently, as autonomous stand-alone projects that interconnect at the end: the graphical interface, reading and writing data, clustering and ordering, simplification, guessing, and adaptation. The development of the graphical interface should be based on a graphics library that offers maintenance guarantees, with periodic updates of its components; gtkmm and Qt are two good examples. It should be built in a parameterized way to allow easier upgrades and modifications. Among the various characteristics this interface will have, we highlight two: the possibility of customizing the available options according to the user's wishes, and editing features for the points and figures generated on the screen, so that results can be compared more quickly. In the Reading (and Writing) of data part, methods will be implemented for handling various file formats (text and image).
The most common formats will be supported, including those used in graphics applications for vector drawing and image editing, to allow interaction with various existing applications. The Clustering and Ordering part will include some of the algorithms studied. Our research found that, so far, the algorithm which obtains the best results compared with previous algorithms is Gathan. This algorithm should be re-implemented, since the original version uses the Triangle library, whose last update was in 2005 and which is no longer maintained. In its place we will use the CGAL library, which provides robust and efficient data types and many available algorithms. Although the Gathan algorithm shows good results in the reconstruction of curves, collections of curves and sharp corners, it still does not handle intersections and overlapping shapes; we will give particular importance to addressing these issues. The Simplification part
will implement Kolesnikov's algorithm cited in Section 3, following the article in which it is described (fast reduced-search with a bounding corridor). In the shape recognition (Guessing) part, besides a deterministic classifier, we will implement a classifier based on data collected in the previous phases, which will be fitted to existing models and can thus better predict the form or forms present in the set of points. In the adaptation part, we will use the algorithm already used in the applications above. Special attention should also be given to evaluating the system in action, with the design of methods for testing.
Acknowledgement

I would like to thank the LIA Research Group for supporting this work. I owe many thanks to David Reboiro Jato and Pedro Silva Calveiro for providing the code of Shaprox and Sharec. I am heartily thankful to my Master's thesis advisor, Arno Formella, whose encouragement, guidance and support from the initial to the final level enabled me to develop an understanding of the subject.
References

1. H. Stone. Approximation of curves by line segments. Mathematics of Computation, 15(73):40-47, 1961.
2. R. Bellman. On the approximation of curves by line segments using dynamic programming. Communications of the ACM, 4(6):284, 1961.
3. B. Gluss. Further remarks on line segment curve-fitting using dynamic programming. Communications of the ACM, 5(8):441-443, 1962.
4. C.L. Lawson. Characteristic properties of the segmented rational minmax approximation problem. Numerische Mathematik, 6(1):293-301, 1964.
5. M.G. Cox. Curve fitting with piecewise polynomials. IMA Journal of Applied Mathematics, 8(1):36, 1971.
6. A. Cantoni. Optimal curve fitting with piecewise linear functions. IEEE Transactions on Computers, pages 59-67, 1971.
7. J.C. Perez and E. Vidal. Optimum polygonal approximation of digitized curves. Pattern Recognition Letters, 15(8):743-750, 1994.
8. J.M. Chen, J.A. Ventura, and C.H. Wu. Segmentation of planar curves into circular arcs and line segments. Image and Vision Computing, 14(1):71-83, 1996.
9. P.S. Heckbert and M. Garland. Survey of polygonal surface simplification algorithms, 1997.
10. C.C. Tseng, C.J. Juan, H.C. Chang, and J.F. Lin. An optimal line segment extraction algorithm for online Chinese character recognition using dynamic programming. Pattern Recognition Letters, 19(10):953-961, 1998.
11. K. Mori, K. Wada, and K. Toraichi. Function approximated shape representation using dynamic programming with multi-resolution analysis. ICSPAT 99, 1999.
12. M. Salotti. Improvement of Perez and Vidal algorithm for the decomposition of digitized curves into line segments. In 15th International Conference on Pattern Recognition, Proceedings, volume 2, 2000.
13. A. Kolesnikov and P. Fränti. A fast near-optimal algorithm for approximation of polygonal curves. 16:335-338, 2002.
14. A. Kolesnikov and P. Fränti. Reduced-search dynamic programming for approximation of polygonal curves. Pattern Recognition Letters, 24(14):2243-2254, 2003.
15. A. Kolesnikov and P. Fränti. Polygonal approximation of closed contours. Image Analysis, 2749:409-417, 2003.
16. F. Attneave. Some informational aspects of visual perception. Psychological Review, 61(3):183-193, 1954.
17. Jack Sklansky and Victor Gonzalez. Fast polygonal approximation of digitized curves. Pattern Recognition, 12(5):327-331, 1980.
18. D.H. Douglas and T.K. Peucker. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica: The International Journal for Geographic Information and Geovisualization, 10(2):112-122, 1973.
19. T. Pavlidis and S.L. Horowitz. Segmentation of plane curves. IEEE Transactions on Computers, 100(23):860-870, 1974.
20. L.S. Davis and A. Rosenfeld. Curve segmentation by relaxation labeling. IEEE Transactions on Computers, pages 1053-1057, 1977.
21. J. Hershberger and J. Snoeyink. Speeding up the Douglas-Peucker line simplification algorithm. 1:134-143, 1992.
22. A. Pikaz and I. Dinstein. Optimal polygonal approximation of digital curves. Pattern Recognition, 28(3):373-379, 1995.
23. T.Y. Phillips and A. Rosenfeld. An ISODATA algorithm for straight line fitting. Pattern Recognition Letters, 7(5):291-297, 1988.
24. F. Glover and M. Laguna. Tabu search. Modern Heuristic Techniques for Combinatorial Problems, 1993.
25. P.Y. Yin. A new method for polygonal approximation using genetic algorithms. Pattern Recognition Letters, 19(11):1017-1026, 1998.
26. U. Vallone. Bidimensional shapes polygonalization by ACO. Ant Algorithms, pages 79-101, 2002.
27. P.Y. Yin. Ant colony search algorithms for optimal polygonal approximation of plane curves. Pattern Recognition, 36(8):1783-1797, 2003.
28. David Reboiro Jato. Cálculo de un polígono para que represente a un conjunto de puntos de forma óptima y extensión del algoritmo para optimizar segmentaciones de imágenes. Proyecto fin de carreira, INX-117, Escuela Superior de Ingeniería Informática, Universidade de Vigo, biblioteca, September 2005.
29. Pedro Silva Calveiro. Reconocimiento de formas a partir de una nube de puntos. Proyecto fin de carreira, ENI-143, Escuela Superior de Ingeniería Informática, Universidade de Vigo, biblioteca, June 2007.
30. http://www.cse.ohio-state.edu/~tamaldey/curverecon.htm.
31. P.J. Schneider and D.H. Eberly. Geometric Tools for Computer Graphics. Morgan Kaufmann, 2003.
32. J.E. Goodman and J. O'Rourke. Handbook of Discrete and Computational Geometry. Chapman & Hall, 2004.
33. T.H. Cormen, C.E. Leiserson, R.L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, McGraw-Hill, second edition, 2001.
34. D.R. Chand and S.S. Kapur. An algorithm for convex polytopes. Journal of the ACM (JACM), 17(1):78-86, 1970.
35. R.L. Graham. An efficient algorithm for determining the convex hull of a finite planar set. Information Processing Letters, 1(4):132-133, 1972.
36. R.A. Jarvis. On the identification of the convex hull of a finite set of points in the plane. Information Processing Letters, 2(1):18-21, 1973.
37. W.F. Eddy. A new convex hull algorithm for planar sets. ACM Transactions on Mathematical Software (TOMS), 3(4):398-403, 1977.
38. A. Bykat. Convex hull of a finite set of points in two dimensions. Information Processing Letters, 7(6):296-298, 1978.
39. F.P. Preparata and S.J. Hong. Convex hulls of finite sets of points in two and three dimensions. Communications of the ACM, 20:87-93, 1977.
40. A.M. Andrew. Another efficient algorithm for convex hulls in two dimensions. Information Processing Letters, 9(5):216-219, 1979.
41. M. Kallay. The complexity of incremental convex hull algorithms in R^d. Information Processing Letters, 19(4):197, 1984.
42. G. Voronoi. Nouvelles applications des paramètres continus à la théorie des formes quadratiques. Premier mémoire. Sur quelques propriétés des formes quadratiques positives parfaites. Journal für die reine und angewandte Mathematik (Crelle's Journal), 1908(133):97-102, 1908.
43. A. Bowyer. Computing Dirichlet tessellations. The Computer Journal, 24(2):162, 1981.
44. D.F. Watson. Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes. The Computer Journal, 24(2):167, 1981.
45. S. Fortune. A sweepline algorithm for Voronoi diagrams. Algorithmica, 2(1):153-174, 1987.
46. http://www.pi6.fernuni-hagen.de/geomlab/voroglide/index.html.en.
47. http://kam.mff.cuni.cz/~ludek/algovision/algovision.html.
48. B. Delaunay. Sur la sphère vide. Izvestia Akademii Nauk SSSR, VII Seria, Otdelenie Matematicheskii i Estestvennyka Nauk, 7(6):793-800, 1934.
49. K.R. Gabriel and R.R. Sokal. A new statistical approach to geographic variation analysis. Systematic Biology, 18(3):259, 1969.
50. D.W. Matula and R.R. Sokal. Properties of Gabriel graphs relevant to geographic variation research and the clustering of points in the plane. Geographical Analysis, 12(3):205-222, 1980.
51. F.P. Preparata and M.I. Shamos. Computational Geometry: An Introduction. Springer, 1985.
52. D. Eppstein, M.S. Paterson, and F.F. Yao. On nearest-neighbor graphs. Discrete and Computational Geometry, 17(3):263-282, 1997.
53. Godfried T. Toussaint. The relative neighbourhood graph of a finite planar set. Pattern Recognition, 12(4):261-268, 1980.
54. Kenneth J. Supowit. The relative neighborhood graph, with an application to minimum spanning trees. Journal of the ACM, 30(3):428-448, 1983.
55. J. Katajainen, O. Nevalainen, and J. Teuhola. A linear expected-time algorithm for computing planar relative neighbourhood graphs. Information Processing Letters, 25(2):77-86, 1987.
56. J.W. Jaromczyk and G.T. Toussaint. Relative neighborhood graphs and their relatives. Proceedings of the IEEE, 80(9):1502-1517, 1992.
57. A. Lingas. A linear-time construction of the relative neighborhood graph from the Delaunay triangulation. Computational Geometry, 4(4):199-208, 1994.
58. N. Amenta, M. Bern, and D. Eppstein. The crust and the beta-skeleton: combinatorial curve reconstruction. Graphical Models and Image Processing, 60(2):125-135, 1998.
59. H. Edelsbrunner, D. Kirkpatrick, and R. Seidel. On the shape of a set of points in the plane. IEEE Transactions on Information Theory, 29(4):551-559, 1983.
60. http://cgm.cs.mcgill.ca/~godfried/teaching/projects97/belair/alpha.html.
61. H. Edelsbrunner. Alpha shapes: a survey.
62. D.G. Kirkpatrick and J.D. Radke. A framework for computational morphology. Computational Geometry, 85:217-248, 1985.
63. T.K. Dey and R. Wenger. Reconstructing curves with sharp corners. page 241, 2000.
64. T.K. Dey and R. Wenger. Fast reconstruction of curves with sharp corners. International Journal of Computational Geometry and Applications, 12(5):353-400, 2002.
65. T.K. Dey and P. Kumar. A simple provable algorithm for curve reconstruction. page 894, 1999.
66. N. Amenta, M. Bern, and M. Kamvysselis. A new Voronoi-based surface reconstruction algorithm. page 421, 1998.
67. O. Boruvka. On a minimal problem. Práce Moravské Přírodovědecké Společnosti, 3, 1926.
68. R.C. Prim. Shortest connection networks and some generalizations. Bell System Technical Journal, 36(6):1389-1401, 1957.
69. J.B. Kruskal Jr. On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society, 7(1):48-50, 1956.
70. S. Arora. Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems. Journal of the ACM (JACM), 45(5):782, 1998.
71. G. Robins and A. Zelikovsky. Improved Steiner tree approximation in graphs. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, page 779. Society for Industrial and Applied Mathematics, 2000.
72. P. Crescenzi and V. Kann. A compendium of NP optimization problems, 1998.
73. J. Giesen. Curve reconstruction, the traveling salesman problem, and Menger's theorem on length. Discrete and Computational Geometry, 24(4):577-603, 2000.
74. E. Althaus and K. Mehlhorn. TSP-based curve reconstruction in polynomial time. pages 686-695, 2000.
75. H. Imai and M. Iri. Computational-geometric methods for polygonal approximations of a curve. Computer Vision, Graphics, and Image Processing, 36(1):31-41, 1986.
76. A. Kolesnikov. Efficient algorithms for vectorization and polygonal approximation. 2003.
77. P.L. Rosin. Techniques for assessing polygonal approximations of curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(6):659-666, 1997.
78. U.M. García-Palomares and J.F. Rodríguez. New sequential and parallel derivative-free algorithms for unconstrained minimization. SIAM Journal on Optimization, 13:79, 2002.
79. http://www.algorithmic-solutions.com/leda/ledak/index.htm.
80. http://www.cgal.org/, June 2010.
81. http://www.boost.org/.
82. http://www.cs.cmu.edu/~quake/triangle.html.