An Animation Definition Interface: Rapid Design of MPEG-4 Compliant Animated Faces and Bodies

Erich Haratsch, Technical University of Munich, erich@lis.e-technik.tu-muenchen.de
Jörn Ostermann, AT&T Labs Research, osterman@research.att.com

Abstract

Many real-time animation programs, including MPEG-4 terminals with face and body animation capabilities, will run a proprietary renderer using a proprietary face or body model. Usually, the animation of a proprietary model is not compatible with the MPEG-4 requirements. Furthermore, implementing and modifying animation parameters like smiles or eyebrow movement in these renderers is cumbersome and time-consuming. In this contribution, a process is proposed that allows the fast definition of animation parameters for proprietary models and their inclusion into proprietary real-time rendering software. The proprietary model is read into any commercially available modeler, which is then used to define the behavior of the different animation parameters. For each animation parameter, the modified model is stored. The animation definition interface, a model analysis software, compares the original model with the animated model and extracts the essential animation parameters. These parameters are stored in tables and are used by the real-time animation program to generate the designed expression.

1. Introduction

Currently, ISO/IEC JTC1/WG11, the same Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) that developed MPEG-1 and MPEG-2, is developing the new standard MPEG-4 [1]. Among other items, MPEG-4 strives to define a standardized interface allowing the animation of face and body models within an MPEG-4 terminal [2]. Due to the fast advances in computer graphics hardware, it is not foreseen that MPEG-4 will standardize the face and body models themselves. Instead, face and body definition parameters (FDP, BDP) are defined for specifying the shape and surface of a model [7]. For the animation of the models, face and body animation parameters (FAP, BAP) are standardized. These animation parameters include low-level parameters like raise left outer eyebrow and tongue roll as well as high-level parameters like joy. Assuming that different terminals allow for models with different degrees of complexity, a process is required that allows the rapid development of models suited for animation. The use of standardized file formats like VRML would allow the use of commonly available modeling software (modelers) like COSMO Worlds or Alias/Wavefront PowerAnimator to design
animations. However, formats like VRML 1, VRML 2 [3][4][5], and libraries like OpenInventor [6] allow the definition of animation parameters for transforms like rotation or scaling of rigid objects, but not for components of flexibly connected objects. Face and body animation requires flexible deformation. Since such deformation is currently not easily implemented using OpenInventor or VRML 2-based application programming interfaces (APIs), the real-time renderer must be proprietary. Usually, the real-time renderer can read and write VRML or OpenInventor files; however, the definition of animations like smiles is built into the renderer, and convenient editors for defining these animation capabilities are missing.

In this contribution, we describe the interface between a modeler (here Alias/Wavefront PowerAnimator) and a real-time renderer (here AT&T's virtual operator) that allows the rapid definition, modification, and implementation of animation parameters. Since the interface reads VRML files from the modeler, it is independent of the modeler. The interface writes a VRML file and one accompanying table for each defined animation parameter, making this information easy to integrate into proprietary renderers.

2.0 Animation Definition Interface

The proposed animation definition interface (ADI) between the modeler and the real-time renderer assumes that the animated models are described as wireframes. VRML 2 wireframes are defined using IndexedFaceSets. The definition of an animation parameter is given in an animation definition table (ADT) computed by the ADI. The interface takes as its input several VRML objects describing static models with a topology appropriate for the renderer [10].

Figure 1 shows how the proposed system is integrated with the modeler and the renderer. The model of the renderer is exported as a VRML file and read into the modeler. In order to design the behavior of the model for one animation parameter, the model is deformed using the tools of the modeler. Usually, restrictions on the topology of the model exist. For simplicity, we assume that the model is deformed only by moving relevant vertices and not by changing its topology. The modeler exports the deformed model as a VRML file.
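For concreteness, an ADT can be pictured as a list of (vertex index, displacement) pairs for one animation parameter. The following C++ sketch shows one possible in-memory representation; the type and member names are our own illustration and are not prescribed by MPEG-4 or by the ADI file format:

// Hypothetical in-memory layout of an animation definition table (ADT).
// The actual ADT is a simple table accompanying the VRML model file.
#include <vector>

struct Vec3 { float x, y, z; };

struct AdtEntry {
    int  vertexIndex;   // index into the IndexedFaceSet's Coordinate node
    Vec3 displacement;  // 3D displacement at animation parameter value 1.0
};

struct AnimationDefinitionTable {
    int fapId;                      // e.g., the MPEG-4 FAP this table defines
    std::vector<AdtEntry> entries;  // one entry per affected vertex
};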
Figure 1: Animation Definition Interface (ADI): The model is defined in a VRML file; the effects of animation parameters are defined in animation definition tables (ADT) referencing vertices of the VRML file. The modeler is used to generate VRML files with the object in different animated positions. The renderer reads the VRML file and tables. Then, the model can be animated using animation parameters like MPEG-4 FAPs.

The ADI compares the output of the modeler with its input, the model exported from the renderer. By comparing the vertex positions of the two models, the vertices affected by the newly designed animation parameter are identified. For each affected vertex, the ADI computes a 3D displacement vector defining the deformation and exports this information in an animation definition table. The renderer reads the VRML file of the model and the table in order to learn the definition of the new animation parameter. Now, the renderer can use the newly defined animation as required by the animation parameters. The displacement of each affected vertex is obtained by scaling its 3D displacement vector from the ADT by the current value of the animation parameter.
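The two steps just described, extracting displacement vectors by comparing the neutral and the deformed model, and applying them in the renderer, are simple enough to sketch. The following C++ fragment reuses the Vec3 and ADT types from the sketch above and assumes that both VRML files have been parsed into vertex arrays with identical vertex ordering, which holds here because the topology is not changed in the modeler; the function names and the epsilon threshold are our own choices:

#include <cmath>
#include <vector>

// ADI side: diff the deformed model against the neutral model and keep
// only the vertices that actually moved.
AnimationDefinitionTable buildAdt(const std::vector<Vec3>& neutral,
                                  const std::vector<Vec3>& deformed,
                                  int fapId, float epsilon = 1e-6f)
{
    AnimationDefinitionTable adt;
    adt.fapId = fapId;
    for (std::size_t i = 0; i < neutral.size(); ++i) {
        Vec3 d = { deformed[i].x - neutral[i].x,
                   deformed[i].y - neutral[i].y,
                   deformed[i].z - neutral[i].z };
        if (std::fabs(d.x) > epsilon || std::fabs(d.y) > epsilon ||
            std::fabs(d.z) > epsilon)
            adt.entries.push_back({ static_cast<int>(i), d });
    }
    return adt;
}

// Renderer side: displace each affected vertex by its ADT vector scaled
// by the current value of the animation parameter.
void applyFap(std::vector<Vec3>& vertices,
              const AnimationDefinitionTable& adt, float value)
{
    for (const AdtEntry& e : adt.entries) {
        Vec3& v = vertices[e.vertexIndex];
        v.x += value * e.displacement.x;
        v.y += value * e.displacement.y;
        v.z += value * e.displacement.z;
    }
}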
2.1 Approximation of Non-Linear Deformations by Straight Lines

The converter as described above only allows the renderer to deform the model by moving vertices along the defined 3D displacement vectors. While this might be sufficient for simple actions like moving the left eyebrow up, complex motions like smile or tongue roll up are not sufficiently modeled by linearly moving vertices. Therefore, we propose to create several VRML files for different phases of the animation, thus allowing for a piece-wise linear approximation of complex deformations (Figure 2).

Figure 2: An arbitrary motion trajectory is piece-wise linearly approximated.

For a smile, writing three files with smile=0.3, smile=0.7, and smile=1.0 is sufficient to allow for a subjectively convincing piece-wise linear approximation of this relatively complex deformation.
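With one displacement table per key value of the parameter, the renderer can evaluate any intermediate value by interpolating between the two enclosing keys. The sketch below, again using Vec3 from above, interpolates the displacement of a single vertex for key values such as 0.3, 0.7, and 1.0; the function name and the clamping behavior above the last key are our own choices, not mandated by the ADI:

#include <vector>

// Piece-wise linear evaluation of a complex deformation for one vertex.
// keys     : parameter values at which VRML files were written, ascending,
//            e.g. {0.3f, 0.7f, 1.0f} for the smile example.
// keyDisp  : displacement of this vertex at each key value.
// value    : current animation parameter value.
Vec3 evalDisplacement(const std::vector<float>& keys,
                      const std::vector<Vec3>& keyDisp, float value)
{
    float prevKey = 0.0f;
    Vec3  prev = { 0.0f, 0.0f, 0.0f };  // neutral pose at value 0
    for (std::size_t k = 0; k < keys.size(); ++k) {
        if (value <= keys[k]) {
            // Interpolate within the segment [prevKey, keys[k]].
            float t = (value - prevKey) / (keys[k] - prevKey);
            return { prev.x + t * (keyDisp[k].x - prev.x),
                     prev.y + t * (keyDisp[k].y - prev.y),
                     prev.z + t * (keyDisp[k].z - prev.z) };
        }
        prevKey = keys[k];
        prev = keyDisp[k];
    }
    return keyDisp.back();  // clamp values beyond the last key
}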
2.2 Application Example

The procedure outlined above was used to define the entire set of MPEG-4 FAPs for a proprietary face animation renderer. The model is an extension of Parke's model [9]. The FAPs integrate nicely with the model's talking capability [8] (Figure 3).

Figure 3: MPEG-4 will allow the animation of computer graphics heads by synthetic speech and animation parameters. Here, several frames from the text "Speech synthesis by AT&T" are shown.

Figure 4: A smile as defined with the Animation Definition Interface.

Animated sequences using different personalities will be shown at the conference (Figure 4). Although this example shows only the animation of the wireframe by deformation, the process can be extended to allow the definition of animation parameters for appearance attributes like surface color and texture maps.

2.3 Flexible Deformations on OpenGL-Based Graphics Subsystems

Most newly available graphics boards for PCs and workstations support rendering based on the OpenGL engine. Similarly, VRML 2 browsers and OpenInventor are based on OpenGL [11]. It is therefore essential to enable real-time deformations of models rendered on an OpenGL engine, and it is imperative to use hardware-supported functions of OpenGL as much as possible. OpenGL does not allow wireframes to be deformed by moving parts of a wireframe or IndexedFaceSet. Therefore, the CPU has to update the vertex positions of the wireframe according to the animation parameters as defined in the table [12]. However, we can still take full advantage of the OpenGL rendering engine's speed for global motions, lighting, texture mapping, etc.
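A minimal per-frame loop on such a system might look as follows. This is an illustration of the division of labor between CPU and OpenGL, not the actual code of our renderer; it assumes OpenGL 1.1 vertex arrays, a triangulated IndexedFaceSet, and the applyFap helper from the earlier sketch:

#include <GL/gl.h>
#include <vector>

// The CPU rewrites the deformed vertex positions each frame; OpenGL then
// performs the global transform, lighting, and texturing in hardware.
void renderFrame(std::vector<Vec3>& vertices,                // working copy
                 const std::vector<Vec3>& neutral,           // neutral pose
                 const std::vector<unsigned int>& triangles, // 3 indices per face
                 const AnimationDefinitionTable& adt, float fapValue)
{
    vertices = neutral;                 // reset to the neutral pose
    applyFap(vertices, adt, fapValue);  // CPU-side deformation from the ADT

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, &vertices[0]);
    glDrawElements(GL_TRIANGLES, (GLsizei)triangles.size(),
                   GL_UNSIGNED_INT, &triangles[0]);
    glDisableClientState(GL_VERTEX_ARRAY);
}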
3. Conclusions

A process was defined that allows the rapid definition of new animation parameters for proprietary renderers, even allowing for peculiarities of proprietary models. In a first step, a proprietary model is animated in a standard modeler. The animated models are saved as VRML files. The output of the Animation Definition Interface is the model and a table describing the new animation parameter. This information is read by the renderer and used whenever the animation parameter is required. The proposed process with the ADI can also easily be used to generate new shapes from the original model.

4. References

1. Leonardo Chiariglione (convenor), "MPEG", http://drogo.cselt.stet.it/mpeg.
2. Peter K. Doenges, Tolga K. Capin, Fabio Lavagetto, Jörn Ostermann, Igor S. Pandzic, Eric D. Petajan, "MPEG-4: Audio/Video & Synthetic Graphics/Audio for Mixed Media", Signal Processing: Image Communication, accepted for publication in 1997.
3. "Virtual Reality Modeling Language, Version 2.0", ISO/IEC JTC1 SC24, ISO/IEC CD 14772.
4. "The VRML 2.0 Specification", http://www.vrml.org/vrml2.0/final.
5. J. Hartman, Josie Wernecke, The VRML 2.0 Handbook, Addison Wesley, New York, 1996.
6. Open Inventor Architecture Group, Open Inventor C++ Reference Manual, Addison Wesley, New York, 1994.
7. "SNHC Verification Model 4.0", ISO/IEC JTC1/SC29/WG11 N1666, Bristol meeting, April 1997.
8. M. Cohen, D. Massaro, "Modeling coarticulation in synthetic visual speech", in N. M. Thalmann and D. Thalmann, editors, Models and Techniques in Computer Animation, pp. 141-155, Springer Verlag, Tokyo, 1993.
9. Frederic I. Parke, Keith Waters, Computer Facial Animation, A K Peters Ltd, Wellesley, Massachusetts, Chapter 6, 1996.
10. L. Chen, J. Ostermann, "Animated talking head with personalized 3D head model", 1997 Workshop on Multimedia Signal Processing, Princeton, NJ, USA, June 1997.
11. Jackie Neider et al., OpenGL Programming Guide, Addison Wesley, New York, 1993.
12. E. Haratsch, J. Ostermann, "Parameter based animation of 3D head models", submitted to Picture Coding Symposium PCS'97, Berlin, Germany, September 1997.