DigiLog Space Generator for Tele-collaboration in an Augmented Reality Environment

DigiLog Space Generator for Tele-collaboration in an Augmented Reality Environment Kyungwon Gil, Taejin Ha, Woontack Woo KAIST UVR Lab, Daejeon 305-701, S. Korea {kgil, taejinha, wwoo}@kaist.ac.kr Abstract. Tele-collaboration can allow users to connect with a partner or their family in a remote place. Generally, tele-collaborations are performed in front of a camera and screen. Due to their fixed positions, these systems have limitations for users who are moving. This paper proposes an augmentedreality based DigiLog Space Generator. We can generate interested space and combine remote space in real time ensuring movement. And our system uses reference object to calculate scale of space and coordinates. Scale and coordinates are saved at Database(DB) and used for realistic combination of space. DigiLog Space Generator is applicable to many AR applications. We discuss the experiences and limitations of our system. Future research is also described. Keywords: Augmented Reality; Tele-collaboration; Human-Computer Interaction; 1 Introduction Tele-collaboration technology aims at connecting users in different locations through a network to achieve common goals efficiently by working together. Typical telecollaboration at present involves a user who communicates with remote participants and shares digital information in front of the screen. There are many applications for tele-collaboration: Users "attend" Web conferences with a company partner far away, using video conference or data conference systems; teachers and students look at the same data and study together in their own home or office; and so on. In recent years, tele-medicine enables a distant doctor to provide medical treatment to a patient in his or her own home. These tele-collaboration technologies can overcome the spatial limitations of traditional collaboration, because it can allow collaborators to transcend space. Moreover, tele-collaboration requires no expenditures of time and money for attending meetings. However this experience of remote collaboration is mainly possible within a spatially limited environment where users do not move. For example, in a 2-D display environment, users in fixed location can write and draw together with a remote participant or show marker based-ar information using a monitor or a large display [1-3]. In 2.5-D display environment, users can collaborate in a limited 3-D environment using a curved screen that has been installed in a fixed location [4]. On the other hand, multiple cameras or depth cameras are used in a 3-D display environment [5-7], but tele-collaborations studied are still possible in front of a fixed

camera or screen. So we need a noble method that ensures user's mobility and allows perfect 3-D interaction. This paper suggests an augmented reality(ar) based DigiLog Space Generator. DigiLog Space Generator can generate 3-D collaboration space simply in an arbitrary and common environment. DigiLog Space is a combination of the physical world and a mirror world (digitization of the real world) [8]. The real world and mirror world are connected bidirectionally. In DigiLog Space, information is shared in real-time. DigiLog Space Generator generates these DigiLog Spaces and allows telecollaboration through converging DigiLog Spaces. That is, the DigiLog Space Generator is composed of two parts. One is the technology of generating DigiLog Space; the other is the technology of combining several DigiLog Spaces. The overall concept and procedure is illustrated in Figure 1. When we generate space in the real world, we also generate a mirror world. A mirror world is not physical space, but virtual space. A mirror world does not have a physical scale, and thus uses an arbitrary scale. If the scale of a mirror world and a real world are different, remote space combines with the mirror world unnaturally, because remote space and virtual information in the space are bigger or smaller than the real things. To solve this problem, we calculated the scale of generated space and saved the scale. The user combines their own space and the remote space realistically using the physical scale. We experimented and discussed the accuracy of generated space. The remainder of this paper is organized as follows: Section 2 explains the system design of the DigiLog Space Generator. Section 3 deals with the implementation details and experimental results of in situ space modeling and combination. Finally, the conclusion and future directions of the DigiLog space generator are summarized in Section 4. Figure 1: DigiLog Space Generation and combination. 2 DigiLog Space Generator Figure 2 is a block diagram of a total system. It is composed of DigiLog space generation, DigiLog Space Combination, and a DigiLog Space Management Module. The DigiLog Space Generation Module reconstructs a space in real time through

images from an HMD camera and user input[9]. Then it gains a 3D feature map. Next, we model an object of interest (OoI) as a reference. The modeled object has local coordinates and a scale. Then a plane is made and extruded through user input. When the system generates space using the 3D feature map, we can track the space. If space is generated, the space Id, coordinates, scale and pose are stored at Digilog Space Management. Then the system performs a combination step for bring remote space. When a reference object is detected, the system makes a request for space sharing to DigiLog Space Management. Then the system can bring remote space at DB, and combine multiple spaces based on the reference object. Finally, the user can see the AR information and communicate with a remote partner. Figure 2: Overall procedure of DigiLog Space Generator. We generate a partial space of interest according to the need to ensure the user s mobility and share information efficiently. Because the generated space contains data for camera tracking, the HMD camera can be tracked at the space. Virtual information that we want to share can be registered in the space accurately at the HMD camera. Generally, conventional methods of space generation are conducted offline because it takes too much time to reconstruct space. These offline methods of space reconstruction are separated from the step of information sharing. Users can share information only at a fixed space that is already made. To define DigiLog space as an arbitrary environment, we propose a method of simple space modeling in real time for DigiLog Space generation. Our system captures environmental images using an HMD camera and reconstructs the environment with a 3D feature map based on the structure-from-motion (SFM) method. A DigiLog Space is then defined by a user in the physical environment. We model only interested partial space for real time and share information selectively. Information outside the space can be hidden using an occlusion effect by generating partial space. We can define a space range that we want to share.

The modeling of an object of interest (OoI) is an important cue to combine remote space and bring augmented information. First we set the reference coordinates on the OoI, because the reference coordinates of space are needed to bring in the remote shared space and AR information. When the user selects a bottom-line reference object, reference coordinates are made at the bottom left corner of an OoI by refining coordinates. Therefore, the OoI is modeled and tracked separately based on the generated DigiLog space, with its own coordinate system. For this, multiple featurevocabulary trees are managed for the space and the objects[10]. If we input the physical scale of the OoI, our system calculates the scale of a generated space using scale of OoI. We know the physical scale of the mirror world and save that scale at DB. We assume that the user and partner use common OoI as reference objects. The modeled, common OoI has the same feature map, scale, and coordinates, so we can combine difference spaces suitably. In other words, multiple DigiLog spaces are synchronized through connected coordinates. Therefore, sharing of AR information in real time is possible in a combined space. The whole scenario of the DigiLog space generator is shown in Figure 3. User A, wearing video see-through HMD, generates his own space and models OoI in situ with the HMD camera and a handheld input device. Input from user A in the real world makes the mirror world A at the same time. So user A, wearing HMD, can see the real world with the mirror world. Then user B combines her DigiLog space with remote space A based on a modeled dragon poster as a reference object created by the DigiLog Space Generator. User B, wearing HMD, can see both the virtual information from herself and from user A. Therefore user B can share 3D information and live communication in the 3D environment while walking. But there is a limitation in that the reference object is always seen in the user s view. 3 Experiment and Implementation To verify the performance of the DigiLog Space Generator, we measured the accuracy of generated space. We compare scale of physical space and generated space(mirror world) for measuring accuracy. Minimizing the error between of spaces is important to combine realistically. In our experiment, we set the partial space for the experiment as shown in Figure 4. The yellow box shape indicates our partial space. To make the space, the user generates coordinates at a reference object (the poster) through input. Next, the user touches four corners of the front wall to generate a plane. Finally, the plane is extruded by user interaction. Then scale of generated space is calculated and stored at DB. The user conducts the generating-space experiment 10 times. The scale of real space is 180 (x axis) x 170 (y axis) x 90 (z axis) cm. In order to generate a spatially stable space, a reference image must show all of the HMD camera view. Therefore, the user creates a space at a spot that is 4 meters away from the reference image. For stable tracking, the user only has to move the translation. The experiments were performed in a typical indoor environment. A video see-through

HMD was used with 800 600 pixel resolution. A camera on the HMD captured 30 image frames per second. Our system was implemented using 1 OpenCV and 2 OSG. Figure 3: A whole scenario of the DigiLog Space Generator: (a) In-situ DigiLog Space Generation: 3D features extraction and space extrusion; (b) In-situ DigiLog Space Combination: a user brings remote shared space and AR information. Optionally in order to view the remote partner, a depth camera on a stand was used to detect and segment a person. 1 Open computer vision library, http://sourceforge.net/projects/opencvlibrary/. 2 Open scene graph library, http://www.openscenegraph.org/projects/osg/.

Figure 4: Experiment environment and steps of generating space. Figure 5(a) shows real space (a gray box) and corner points (blue points). Corner points are made by user input. Blue points mean user input. So, if the difference between corner points and box corner points is small, the generated space is similar to real space. Our generated space is pretty accurate because the blue points are crowded around the box corners. When analyzed the average of x, y, and z values, and the average errors of x and y were under 5 cm. However, the average error of z is relatively large at 22.3cm. This points to the limitation of our system. Maybe user didn t know the point we set at the z axis because the depth value of real space is ambiguous. Also, when the user makes a plane standing at the front only, it is generated aslant because 3D points are created inaccurately. For accuracy, the user should make the plane by moving at a variety of angles. Figure 5: Experiment result: (a) Corner points of the space created by the user. The scale of the target space is 180 (x-axis) x 170 (y-axis) x 90 (z-axis) cm; (b) Distance errors between two spaces in each axis. Fig. 6 shows an implementation of DigiLog space combination. A user combines his or her space with remote space based on modeled objects. This combined DigiLog space enables sharing 3-D information and live communication.

AR information HMD Remote partner DigiLog space generation Collaboration Figure 6: A user exploits a video see-through HMD for modeling, tracking objects of interest, and sharing DigiLog space. And user can collaborate with remote partner with AR information in combined space. 4 Conclusion In this paper, we introduce a novel AR-based tele-collaboration technology. Our method is more effective than a conventional, immovable video-conferencing system using cumbersome equipment and installations. We can generate interested space and combine space in real time. Our system can be used at home, or the classroom, and in other installations. For accurate combination, we use a common reference object. The reference object is important for knowing coordinates and scale of space. So our system can synchronize multiple DigiLog Spaces easily with coordinates and scale information. Currently there are limitations for implementation. Our system does not work well in environments that have repetitive textures, or no textures. And generated space and virtual information are unstable if the OoI is occluded. An ambiguous scale of the z axis is also problematic. For future work we plan to use an RGB-D feature map, including color and depth information, for modeling and tracking. This will make generated space more stable under featureless and variant lighting environments, rather than using RGB feature points only. And if we use a depth value when generating space, the user will have a z scale, obviously. Once our system is updated it can be applicable to many AR applications including experimental education, urban planning, military simulation, collaborative surgery, etc. Acknowledgements This work was supported by the Global Frontier R&D Program on <Human-centered Interaction for Coexistence> funded by the National Research Foundation of Korea grant funded by the Korean Government(MEST) Reference [1] M. Kuechler, and A. M. Kunz, Collaboard: A remote collaboration groupware device featuring an embodiment-enriched shared workspace, 16th ACM international conference on Supporting group work, pp. 211 214, 2010. [2] J. Tang, J. Marlow, A. Hoff, A. Roseway, K. Inkpen, C. Zhao, and X. Cao, Time Travel Proxy: Using Lightweight Video Recordings to Create Asynchronous,

Interactive Meetings, SIGCHI Conference on Human Factors in Computing Systems, pp. 3111-3120, 2012. [3] I. Barakonyi, T. Fahmy, and D. Schmalstieg, Remote collaboration using augmented reality video conferencing, Graphics Interface 2004, pp 89 96, 2004. [4] H. Benko, R. Jota, and A. Wilson, MirageTable: Freehand Interaction on a Projected Augmented Reality Tabletop, SIGCHI Conference on Human Factors in Computing Systems, pp. 199-208, 2012. [5] O. Schreer, I. Feldmann, N. Atzpadin, P. Eisert, P. Kauff, and H. Belt, "3DPresence - A system concept for multi-user and multi-party immersive 3D videoconferencing, CVMP, pp. 321 334, 2008. [6] K. Kim, J. Bolton, A. Girouard, J. Cooperstock, and R. Vertegaal, TeleHuman: Effects of 3D perspective on gaze and pose estimation with a life-size cylindrical telepresence pod, SIGCHI Conference on Human Factors in Computing Systems, pp. 2531 2540, 2012. [7] N. Lehment, K. Erhardt, and G. Rigoll, Interface Design for an Inexpensive Hands-Free Collaborative Videoconferencing System, ISMAR, 2012 [8] T. Ha, H. Lee, W. Woo, DigiLog Space: Real-Time Dual Space Registration and Dynamic Information Visualization for 4D+ Augmented Reality, ISUVR, pp. 22-25, 2012 [9] T. Ha and W. Woo, "ARWand for an Augmented World Builder," IEEE 3DUI, 2013 (in press) [10] K. Kim, V. Lepetitb and W. Woo, Real-time interactive modeling and scalable multiple object tracking for AR, Computers & Graphics, Volume 36, Issue 8, 2012, pp. 945 954, 2012