ART Extension for Description, Indexing and Retrieval of 3D Objects

Julien Ricard, David Coeurjolly, Atilla Baskurt
LIRIS, FRE 2672 CNRS, Bat. Nautibus, 43 bd du 11 novembre 1918, 69622 Villeurbanne cedex, France
julien.ricard@liris.cnrs.fr, david.coeurjolly@liris.cnrs.fr, atilla.baskurt@liris.cnrs.fr

Abstract

This paper presents a new three-dimensional shape descriptor, called the 3D Angular Radial Transform (3D ART). This descriptor is an extension of the 2D region-based shape descriptor proposed by MPEG-7, the Angular Radial Transform (ART). We propose to generalize the ART to three dimensions in order to index 3D models.

1. Introduction

Technical 3D model databases have been growing steadily for some twenty years, since the beginnings of computer-aided design. Engineering laboratories and design offices keep increasing the number of 3D solid objects, and current industrial estimates point to the existence of over 3 billion CAD models [7]. Such a number of models requires content-based mining with indexing and retrieval processes. The research community has made great strides in image management, but few works exist on 3D data. In the framework of a partnership with the car manufacturer Renault, we investigate how to build a fast 3D descriptor to index a large technical database. Indexing a huge 3D object database requires a small descriptor and low indexing and retrieval costs, to guarantee fast answers.

To index 3D objects, 3D moment approaches exist, among which the spherical harmonic representation can be cited [3, 5]. For 2D images, the MPEG-7 standard proposes to index images using a region-based shape descriptor, the Angular Radial Transform (ART) [4, 6]. This shape descriptor has many useful properties: compact size, robustness to noise and scaling, invariance to rotation, and the ability to describe complex objects. This paper presents a new 3D shape descriptor, the 3D Angular Radial Transform (3D ART).
This descriptor is a generalization of the ART to 3D objects, and we aim to preserve the properties of the 2D ART. The rest of the paper is organized as follows: first, we survey 3D shape matching; then, we present the 3D ART descriptor; finally, we present experiments and results.

2. Survey of recent methods

Moment approaches can be defined as projections of the function defining the object onto a set of functions characteristic of the given moment. These approaches are used in 2D pattern recognition with several 2D moments: geometrical, Legendre, Fourier-Mellin, Zernike and pseudo-Zernike moments [10]. Some of these moments have been extended to 3D: 3D Fourier [2], 3D Wavelet [8] and 3D Zernike [1]; a spherical harmonic decomposition was used by Vranic and Saupe [11]. The spherical harmonic (SH) descriptor is an efficient method and is used in Section 4 to evaluate the proposed method.

The SH descriptors were introduced by Funkhouser et al. [3]. After a centering step, the spherical harmonic descriptors decompose a 3D shape into an irreducible set of rotation-independent components by sampling the three-dimensional space with concentric shells, where the shells are defined by equal radial intervals. Each spherical function is decomposed as a sum of its first 6 harmonic components [5], analogous to a Fourier decomposition into different frequencies. Using the fact that rotations do not change the norm of the harmonic components, the signature of each spherical function is defined as the list of these 6 norms. Finally, these signatures are combined to obtain a 32 × 6 signature for the 3D model. During a database search, the similarity of two objects is calculated as the Euclidean distance between these vectors.

3. 3D Angular radial transform

In this section, we generalize the MPEG-7 angular radial transform to 3D space.

3.1. 3D ART definition

To apply the 3D ART, the objects are represented in spherical coordinates, where φ is the azimuthal angle in the xy-plane from the x-axis, θ is the polar angle from the z-axis and ρ is the radius from a point to the origin. The 3D ART is a complex orthogonal unitary transform defined on the unit sphere. The 3D ART coefficients are defined by:

$$F_{n m_\theta m_\phi} = \int_0^{2\pi} \int_0^{\pi} \int_0^{1} V_{n m_\theta m_\phi}(\rho, \theta, \phi)\, f(\rho, \theta, \phi)\, \rho \, d\rho \, d\theta \, d\phi \qquad (1)$$

where $F_{n m_\theta m_\phi}$ is an ART coefficient of order n, mθ and mφ, $f(\rho, \theta, \phi)$ is the 3D object function in spherical coordinates and $V_{n m_\theta m_\phi}(\rho, \theta, \phi)$ is a 3D ART basis function (BF). The 3D BF are separable along the radial direction and the two angular directions:

$$V_{n m_\theta m_\phi}(\rho, \theta, \phi) = A_{m_\theta}(\theta)\, A_{m_\phi}(\phi)\, R_n(\rho) \qquad (2)$$

As in 2D, the radial basis function is defined by a cosine function:

$$R_n(\rho) = \begin{cases} 1 & n = 0 \\ 2\cos(\pi n \rho) & n \neq 0 \end{cases} \qquad (3)$$

The angular basis functions are defined by complex exponential functions to achieve rotation invariance:

$$A_{m_\theta}(\theta) = \frac{1}{2\pi} \exp(2 j m_\theta \theta), \qquad A_{m_\phi}(\phi) = \frac{1}{2\pi} \exp(j m_\phi \phi) \qquad (4)$$

Two angular basis functions are defined to keep the basis functions continuous along both θ and φ. The values of the parameters n, mθ and mφ are trade-offs between efficiency and accuracy. They are chosen by computing the recall obtained for several settings of these parameters. On the technical database presented in Section 4, we finally choose n = 3, mθ = 5 and mφ = 5. The real parts of the 3D ART BF are shown in Figures 1 and 2.

Figure 2. Real parts of 3D ART BF.

The similarity measure is computed using the L1 norm between two shapes described by the 3D ART descriptor:

$$d(Q, I) = \sum_{i} \left\| ART3D_Q[i] - ART3D_I[i] \right\| \qquad (5)$$

where Q and I represent a query object and an object of the database, ART3D is the array of 3D ART descriptor values, and the sum runs over the n·mθ·mφ entries of this array.

3.2. Indexing process

An important property of the 2D ART is rotational invariance. In polar coordinates, a rotation can be expressed as the addition of a constant to the angular component:

$$Rot_\alpha(\rho, \theta) = (\rho, \theta + \alpha) \qquad (6)$$

Thus, a rotation modifies neither the norm of the angular function $A_n(\theta)$ nor the ART descriptor. In 3D, arbitrary rotations cannot be expressed as the addition of constant values to the angular components.
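As a rough numerical illustration of Eqs. (1) to (5), the following Python sketch approximates the 3D ART coefficients by a Riemann sum on a regular (ρ, θ, φ) grid and compares two descriptors with the L1 distance. It is a sketch under stated assumptions, not the authors' implementation: the function names, the `samples` parameter, the index ranges 0 ≤ n < 3, 0 ≤ mθ < 5, 0 ≤ mφ < 5, the use of coefficient magnitudes as descriptor values and the test object `blob` are all hypothetical choices made for illustration.

```python
import cmath
import math


def art3d_descriptor(f, n_max=3, m_theta_max=5, m_phi_max=5, samples=8):
    """Magnitudes of the 3D ART coefficients of f(rho, theta, phi).

    Eq. (1) is approximated by a Riemann sum on a regular spherical grid.
    Assumption: orders run over 0 <= n < n_max, 0 <= m < m_max.
    """
    d_rho = 1.0 / samples
    d_theta = math.pi / samples
    d_phi = math.pi / samples  # 2 * samples steps cover [0, 2*pi)
    grid = []
    for i in range(samples):
        rho = (i + 0.5) * d_rho
        for j in range(samples):
            theta = (j + 0.5) * d_theta
            for k in range(2 * samples):
                phi = k * d_phi
                grid.append((rho, theta, phi, f(rho, theta, phi)))
    cell = d_rho * d_theta * d_phi
    descriptor = []
    for n in range(n_max):
        for mt in range(m_theta_max):
            for mp in range(m_phi_max):
                total = 0j
                for rho, theta, phi, value in grid:
                    # Radial basis function, Eq. (3)
                    radial = 1.0 if n == 0 else 2.0 * math.cos(math.pi * n * rho)
                    # Angular basis functions, Eq. (4)
                    a_theta = cmath.exp(2j * mt * theta) / (2.0 * math.pi)
                    a_phi = cmath.exp(1j * mp * phi) / (2.0 * math.pi)
                    # Integrand of Eq. (1), with the basis function of Eq. (2)
                    total += a_theta * a_phi * radial * value * rho
                descriptor.append(abs(total * cell))
    return descriptor


def l1_distance(a, b):
    """L1 distance between two descriptors, Eq. (5)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def blob(rho, theta, phi):
    """Hypothetical smooth test object: a shell bulging towards the x-axis."""
    centre = 0.5 + 0.3 * math.sin(theta) * math.cos(phi)
    return math.exp(-((rho - centre) ** 2) / 0.02)


if __name__ == "__main__":
    alpha = 3 * math.pi / 8  # rotation about z, a multiple of the phi step
    rotated = lambda rho, theta, phi: blob(rho, theta, phi - alpha)
    # Expected to be close to zero: a z-rotation only changes coefficient phases.
    print(l1_distance(art3d_descriptor(blob), art3d_descriptor(rotated)))
```

In this sketch a rotation about the z-axis shifts φ by a constant, so each coefficient is only multiplied by a unit-modulus phase factor and the magnitudes are unchanged. The same argument does not hold for rotations about other axes, which is the difficulty discussed in this section.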
The exponentials of the angular functions $A_{m_i}$ change, and so does the object descriptor. The only rotations under which this description is invariant are the rotations around the z-axis. These rotations do not modify the θ-components of the object points. Hence, they can be expressed as the addition of a constant to the φ-components, which leaves the descriptor invariant. To handle the other rotations, we align the objects according to their principal axis. A Principal Component Analysis (PCA) is applied to obtain the principal directions of each object, and the object is aligned with the z-axis. The alignment is made only along the z-axis. Fig. 3 shows the indexing process.

Figure 1. Real parts of 3D ART BF.

Hence, before the 3D image is projected onto the BF, the objects are pre-processed as follows. First, the object is discretized on a grid such that the voxels are separated into interior and exterior elements of the object. The discretization is used to compute the parameters of centering, scaling and alignment with the z-axis. Then, the discretized object is transformed according to these parameters: it is centered on its center of gravity and scaled up such that the object boundary touches the grid boundary. This pre-processing step makes the 3D ART robust to translation and scaling. Finally, the discretized object is projected onto the 3D ART BF to obtain the 3D ART coefficients. These coefficients are normalized and define the 3D shape descriptor.

Figure 3. Indexing process.

4. Experiments

Two sets of experiments are made. First, we fix the parameter values: the number of basis functions (n, mθ and mφ) and the size of the discretization. Then, we evaluate the robustness of the method. Finally, the 3D ART is compared to the Spherical Harmonic descriptor (SH) [3, 5]. The tests are made on two 3D model databases: the Princeton Shape Benchmark [9] and a Renault database (Fig. 4 shows examples of 3D models). The Renault database is a technical database which contains mechanical models; hence, the shapes of different models in a given class are similar. The Princeton database contains high-level semantic classes, in which the objects of a same class are more heterogeneous.

Figure 4. Example of 3D models. First row: Renault database; second row: Princeton Shape Benchmark.

To fix the parameter values, the recall values are compared. Twelve settings of the parameters n, mθ and mφ were evaluated. Fig. 5.a shows that the best results are obtained for n = 3 and mθ = mφ = 5. Fig. 5.b shows the same experiment with different values of the discretization size S. The best results are obtained on the technical database with S = 64. We use this value in the rest of this work. This value is also suggested in [5] for the SH computation.

To evaluate the robustness of the process, we distort a 3D object by scaling, rotation, translation and noise. Table 1 shows the maximum and the mean distances obtained for these four distortions. For each distortion, we create a set of distorted 3D objects and compute the distance between each of them and the original object. The translation has no effect on the distance, because the pre-processing centers and scales the objects. For the same reason, the scale distortion has only small effects, due to discretization artifacts: the maximum distance between the scaled objects is .6, whereas the mean distance between two objects of the same class is around 3. The obtained distances are smaller than the intra-class distances, so the classification remains the same. The rotation distortion test is a set of rotations around the three axes with random angles; it gives a maximum distance of .272 and a mean distance of .75. The noise distortion is a random displacement of the vertices of the object: each vertex is moved by a random Gaussian vector, whose length is a percentage of the object size. If this distance is higher than % the surface of the object is strongly distorted, but the similarity measure is .6 and the objects are still well classified. Fig. 6 shows objects distorted by noise.
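The pre-processing of Section 3.2 and the noise distortion used in the robustness test can be sketched as follows in Python. This is a minimal illustration under several assumptions, not the authors' code: it works on a plain list of 3D points (for instance voxel centers) instead of a voxel grid, it scales the object into the unit sphere rather than to a grid boundary, it finds the principal axis by power iteration instead of a full PCA, and it takes the bounding-box diagonal as the "object size". All function names are hypothetical.

```python
import math
import random


def cross(u, w):
    """Cross product of two 3D vectors."""
    return [u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0]]


def principal_axis(centered, iterations=200):
    """Dominant eigenvector of the covariance of centered points.

    Power iteration stands in for a full PCA; it assumes that the start
    vector (1, 1, 1) is not orthogonal to the principal direction.
    """
    n = len(centered)
    cov = [[sum(p[a] * p[b] for p in centered) / n for b in range(3)]
           for a in range(3)]
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(3)) for a in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Sign convention: the axis points towards positive z.
    return v if v[2] >= 0 else [-x for x in v]


def normalize_pose(points):
    """Center the points, align their principal axis with z, scale to unit radius."""
    n = len(points)
    centroid = [sum(p[a] for p in points) / n for a in range(3)]
    centered = [[p[a] - centroid[a] for a in range(3)] for p in points]
    axis = principal_axis(centered)
    v = [axis[1], -axis[0], 0.0]  # axis x z, the rotation axis (unnormalized)
    c = axis[2]                   # cosine of the angle between axis and z
    aligned = []
    for p in centered:
        vp = cross(v, p)
        vvp = cross(v, vp)
        # Rodrigues formula for the rotation that maps `axis` onto z.
        aligned.append([p[a] + vp[a] + vvp[a] / (1.0 + c) for a in range(3)])
    radius = max(math.sqrt(sum(x * x for x in p)) for p in aligned)
    return [[x / radius for x in p] for p in aligned]


def add_vertex_noise(points, percent, rng):
    """Move each point by a Gaussian vector; sigma is `percent` of the object size."""
    lo = [min(p[a] for p in points) for a in range(3)]
    hi = [max(p[a] for p in points) for a in range(3)]
    sigma = percent / 100.0 * math.dist(lo, hi)  # bounding-box diagonal (assumption)
    return [[x + rng.gauss(0.0, sigma) for x in p] for p in points]
```

With this normalization, a translated and uniformly scaled copy of a point set is mapped to the same normalized points up to rounding error, which is consistent with the observation that translation has no effect on the distance. In the paper the remaining scale effect comes from the discretization, which this point-based sketch does not model.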

Distortion    Translation   Scale   Rotation   Noise
Max dist.     0             .6      .272       2.27
Mean dist.    0             .3      .75        .2

Table 1. Distances obtained for several distortions.

Figure 5. Recall values to set up parameters.

Figure 6. Example of noise distortions for three distance values: %, 5% and %

A second experiment is set up to compare the 3D ART to the Spherical Harmonic descriptor (SH). This experiment was made on the two model databases. Figures 7.a and 7.b show the recall values for the SH and 3D ART descriptors on the two databases. On the Princeton database (Fig. 7.a), the SH method gives a better description than the ART method. On the Renault database, the results are better with ART (Fig. 7.b). This shows that the ART description gives better results when the objects of a same class are similar. The computation cost and the descriptor size are also significant when comparing the methods (Table 2). The 3D ART indexing time is 2.5 times lower than the SH indexing time, and the descriptor size and the cost of the similarity measure are approximately 7.8 times lower. These differences are due to the fact that the ART does not perform a frequency transformation and stays in real space, whereas the spherical harmonic descriptor performs a frequency transformation.

Figure 7. Recall values on Princeton and Renault databases.

           Indexing time   Descriptor size
SH         l               544
3D ART     4               74

Table 2. Size (in floating numbers) and indexing time (in seconds) comparison between 3D ART and spherical harmonic representation.

5. Conclusion

This paper proposes a new 3D shape descriptor, the 3D ART. This 3D transformation and the indexing process yield a robust 3D shape descriptor. The descriptor is robust to translation, scaling, multi-representation (remeshing, weak distortions), noise and 3D rotation. It fulfills the requirements induced by the analysis of technical model databases: robustness and accuracy of the indexing and retrieval processes, and fast similarity computation. As future work, we plan to investigate 2D/3D retrieval with the 3D ART.

References

[1] M. Elad, A. Tal, and S. Ar. 3D Zernike moments and Zernike affine invariants for 3D image analysis. In 11th Scandinavian Conf. on Image Analysis, 1999.
[2] M. Elad, A. Tal, and S. Ar. Content based retrieval of VRML objects - an iterative and interactive approach. In The Sixth Eurographics Workshop in Multimedia, pages 97-108, 2001.
[3] T. Funkhouser, P. Min, M. Kazhdan, J. Chen, A. Halderman, D. Dobkin, and D. Jacobs. A search engine for 3D models. ACM Transactions on Graphics, 22(1):83-105, Jan. 2003.
[4] S. Jeannin. MPEG-7 Visual part of eXperimentation Model Version 9.0. In ISO/IEC JTC1/SC29/WG11/N3914, 55th MPEG Meeting, Pisa, Jan. 2001.
[5] M. Kazhdan, T. Funkhouser, and S. Rusinkiewicz. Rotation invariant spherical harmonic representation of 3D shape descriptors. Symposium on Geometry Processing, June 2003.
[6] W.-Y. Kim and Y.-S. Kim. A new region-based shape descriptor. In TR 15-01, Pisa, Dec. 1999.
[7] D. McWherter, M. Peabody, and W. C. Regli. Clustering techniques for databases of CAD models. Technical report, Drexel University, Sept. 2001.
[8] E. Paquet and M. Rioux. Influence of pose on 3-D shape classification. In SAE International Conference on Digital Human Modeling for Design and Engineering, Dearborn, MI, USA, June 2000.
[9] Princeton shape benchmark. http://shape.cs.princeton.edu/benchmark/index.cgi.
[10] C.-H. Teh and R. T. Chin. On image analysis by the methods of moments. IEEE Trans. on Pattern Analysis and Machine Intelligence, 10(4):496-513, 1988.
[11] D. V. Vranic and D. Saupe. Description of 3D-shape using a complex function on the sphere. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME 2002), Lausanne, Switzerland, August 2002.