A remote multi-sites communication system providing gaze control by body movement


The Second International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies

Akihiko Yoshida*1, Sadanori Ito*2 and Masaki Nakagawa*3
Faculty of Engineering, Tokyo University of Agriculture and Technology
2-24-16 Nakamachi, Koganei-shi, Tokyo, 184-8588 Japan
*1 akihiko.y0412@gmail.com  *2 sito@cc.tuat.ac.jp  *3 nakagawa@cc.tuat.ac.jp

Abstract

The various remote communication systems developed so far are still far from the awareness of face-to-face communication. There may be several reasons. One of them may be that even if a participant changes his/her position and posture, the pictures of the other participants on the display do not change. Here, we regard a display for remote communication as a real window and simulate communication through that window. We propose a method of capturing the position and posture of each participant and reflecting it into the view of the others on their windows, as if they were talking through the windows. This enables each participant to change the view of the other participants by changing his/her position and posture. We designed a remote communication system incorporating this feature, made a prototype and preliminarily evaluated the feature.

1. Introduction

So far, various remote communication systems have been developed. Videoconferencing by voice and image is becoming common due to the proliferation of inexpensive cameras and the wide spread of computer networks. However, these systems are still far from the awareness of face-to-face communication. We regard the face-to-face (f2f) situation as the ideal condition. In the f2f situation, people are aware of the communicating partners' gazes, expressions, gestures and even the ambient surroundings. Moreover, the range of the view including the partners changes for each participant according to changes in his/her standing position and posture.
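The dependence of the visible range on the viewer's position is plain similar-triangle geometry. As a minimal sketch (not from the paper; the window width and the depth of the viewed scene are assumed values), the strip of a remote scene visible through a window can be computed as:

```python
# Minimal sketch of the "view through a window" geometry (assumed numbers,
# not from the paper).  A window of width `win_w` separates the viewer from
# a remote scene modelled as a plane `depth` behind it; all lengths in cm.

def visible_region(eye_x, eye_dist, win_w=120.0, depth=200.0):
    """Strip of the remote plane visible to an eye offset `eye_x` from the
    window centre and `eye_dist` in front of it; returns (centre, width)."""
    scale = (eye_dist + depth) / eye_dist   # similar triangles
    width = win_w * scale                   # approaching the window widens the view
    center = -eye_x * depth / eye_dist      # stepping right shifts the view left
    return center, width

print(visible_region(0.0, 100.0))   # (0.0, 360.0): centred, 3x window width
print(visible_region(0.0, 50.0))    # closer -> wider visible strip
print(visible_region(30.0, 100.0))  # offset right -> strip centred at -60.0
```

This is exactly why standing position matters in the f2f situation: both the centre and the width of what is seen change with every move of the eye.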
However, current remote communication systems do not reflect this change and only display the view of the partners from fixed positions and angles. We consider this to be one of the reasons why current communication systems cannot provide the awareness of the f2f situation.

Here, we regard conversation through a real window as a metaphor of remote conversation, simulating a distant partner as if he/she were standing on the other side of the window. We call this a virtual window. By placing multiple virtual windows on a display, we can support remote communication among participants at more than two remote sites. We propose a method of capturing the position and posture of each participant and reflecting it into the view of the partners on their windows, as if they were talking through the windows. This enables the user to change the view of the partners by changing his/her position and posture.

Research and development on remote communication considering gaze direction, gaze movement and virtual f2f communication started in the 1970's [1]. MAJIC [2] employs a large, curved screen around each participant and displays remote participants in full scale on the screen, as if they were sitting around a round table. It is one of the studies that exploit the effect of f2f. Hyper-Mirror [3] displays the self-image of a talker in the image of the listener with the listener's background; the two share the listener's space, with the result that they can point to objects in that space and change their gaze direction while being aware of each other. GAZE-2 [4] arranges face images of all the other participants in a 3-dimensional virtual conference room and turns those face images according to their gaze directions, with the result that it can convey gazes among the participants. The GAZE Groupware System [5] is similar to GAZE-2. It also employs face images and a 3-dimensional virtual conference room. Its conference room has a round table that participants can use as a shared workspace.
(978-0-7695-3367-4/08 $25.00 © 2008 IEEE. DOI 10.1109/UBICOMM.2008.25)

Although these systems can convey gaze direction and movement, sharing of the background is limited to one side, only face images are transmitted, and so on. They also require

large-scale facilities. They also do not allow participants to talk about things around all the participants, which is common in real situations. We consider that there is another dimension which enhances the awareness of remote communication while being inexpensive, so that it can be incorporated into common video conferencing systems and employed in other applications. We propose a method of capturing the position and posture of each participant and reflecting it into the view of the other participants on their windows, as if they were talking through the windows. This enables each participant to change the view of the other participants by changing his/her position and posture.

In this paper, section 2 presents the design, section 3 describes the implementation, section 4 presents a preliminary evaluation, section 5 discusses remaining issues and section 6 draws conclusions.

2. Design of the virtual window

This method aims to connect remote participants through virtual windows and imitate the f2f situation where they are talking through real windows.

Figure 1. Simulation of the f2f situation through a virtual window.

In the f2f situation, there is no sense of incompatibility in the size and position of seen objects. Therefore, the system must reproduce them exactly as in the real situation. When a person looks into a window, the angle and the size of an object seen through the window are decided by his/her position (Figure 2). We assume that each participant is looking at the center of the display. Then, the system only has to know the x-, y- and z-position of his/her eye and capture the image of the distant participant with the background from the center of the display. From this information and image, it can calculate where and how large the image of the distant participant should be displayed. It can change the view according to the participant's movement as long as it can obtain the x-, y- and z-position of the participant.

Figure 2. View according to the position of a participant.

3. Implementation of the virtual window
3.1. Detection of the participant's position

It is necessary to detect the participant's eye position. However, eye position detection is difficult under conditions that allow free posture changes and various sources of light. In this research, head position detection was therefore adopted as an approximation. A participant wears the hat shown in Figure 3, fitted with infrared LEDs, and a Wii remote control (Figure 4) detects the head position. Two infrared LEDs are arranged on the parietal region of the hat at an interval of 10 cm. The Wii remote control, with a wide-angle lens attached to expand the range of detection, is set up 260 cm above the floor as shown in Figure 5. It acquires the three-dimensional positional coordinates of the hat with the infrared LEDs, as shown in Figure 6.

3.2. Formation of the view through a window

In an arbitrary site, say site A, a network camera is placed at the center of the display. It captures the image of the room with a sufficiently wide angle so that it can display the entire room when necessary, i.e., when another participant looks into the room from a point very near his/her camera. In another site, say site B, to which site A is connected, the position of the participant in site B is sensed, and the distance and angle from the center of the display in site B are calculated. Then, the view of site A from site B at the detected angle is determined

from the above information, the necessary portion of the image is cut out from the entire image of site A and displayed on the screen of site B at the right position and size. The hardware used to achieve this function is shown in Table 1.

Figure 3. A hat with infrared LEDs.
Figure 4. Wii remote control.

Table 1. Hardware.

  Machine              Type (Manufacturer)       Usage
  PC                   Vostro 200 (Dell)         Execution of the program
  Infrared sensor      Wii remote control        User's position tracking
                       (Nintendo)                (x-, y-detection range: 2 m-3 m)
  Network camera       Axis 212 PTZ (Axis)       Wide-range image capturing at a
                                                 frame rate of 30 fps
  Large-scale display  StarBoard (HitachiSoft)   Displaying remote participants in
                                                 full scale at a resolution of
                                                 1360 x 768

3.3. How it works

Figures 7 and 8 show the view change according to the participant's movement in the implemented system. The system is being used in our laboratory between two rooms, to call someone in the other room, to chat between the two rooms, and so on. Use among three rooms is possible but not yet tuned or practiced at this moment.

Figure 5. Position tracking by the Wii remote control (a user wearing the hat with infrared LEDs in front of the display).

The axes of the detected position are defined as follows:
X axis: the amount of movement parallel to the display.
Y axis: the amount of movement perpendicular to the display.
Z axis: the height of the user's head.

Figure 6. Three-dimensional detection.
Figure 7. View change according to the participant's movement parallel to the display.
Figure 8. View change according to the participant's movement perpendicular to the display (distance from the display: near to far).
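The pipeline of sections 3.1 and 3.2 can be sketched as follows. This is a hypothetical reconstruction, not the authors' code: the field of view of the added wide-angle lens, the pixel scale of the network-camera image, the display width and the depth of the remote-room plane are all assumed constants, and only the axis parallel to the display is treated.

```python
import math

# Hypothetical sketch of the tracking-plus-cropping pipeline (assumed
# constants, not the authors' calibration).

BASELINE_CM = 10.0                 # spacing of the two hat-mounted IR LEDs
IR_RES_X, IR_RES_Y = 1024, 768     # Wii remote IR camera resolution
FOV_X = math.radians(45.0)         # assumed FOV of the added wide-angle lens
FOCAL_PX = (IR_RES_X / 2) / math.tan(FOV_X / 2)   # pinhole focal length, px

def head_position(led_a, led_b):
    """Pixel blobs of the two LEDs -> (x_cm, dist_cm) of their midpoint
    in camera coordinates (x parallel to the display)."""
    sep_px = math.dist(led_a, led_b)
    dist_cm = BASELINE_CM * FOCAL_PX / sep_px     # similar triangles
    mid_px = (led_a[0] + led_b[0]) / 2 - IR_RES_X / 2
    return mid_px * dist_cm / FOCAL_PX, dist_cm

SRC_W, SRC_H = 640, 480    # network-camera frame (30 fps in the prototype)
PX_PER_CM = 1.5            # assumed scale of the remote-room plane in the frame
WIN_W_CM = 120.0           # assumed physical display width
DEPTH_CM = 200.0           # assumed depth of the remote-room plane

def crop_rect(head_x_cm, head_dist_cm):
    """Head position -> (left, top, width, height) crop of the source frame,
    matching the display's 1360x768 aspect ratio."""
    w = min(WIN_W_CM * (head_dist_cm + DEPTH_CM) / head_dist_cm * PX_PER_CM, SRC_W)
    cx = SRC_W / 2 - head_x_cm * DEPTH_CM / head_dist_cm * PX_PER_CM
    left = min(max(cx - w / 2, 0), SRC_W - w)
    h = min(w * 768 / 1360, SRC_H)
    top = min(max(SRC_H / 2 - h / 2, 0), SRC_H - h)
    return round(left), round(top), round(w), round(h)

# Two LED blobs 100 px apart, centred on the sensor -> head straight ahead:
x_cm, dist_cm = head_position((462, 384), (562, 384))
print(round(x_cm), round(dist_cm))   # 0 and roughly 124 cm
print(crop_rect(x_cm, dist_cm))      # crop for that head position
print(crop_rect(0.0, 60.0))          # closer head -> wider crop
```

The crop grows as the viewer approaches the display and slides opposite to the viewer's lateral movement, which reproduces the window behaviour of Figures 7 and 8.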

4. Preliminary evaluation

We made a preliminary evaluation of the view change according to the user's movement.

Figure 9. A scene of the evaluation experiment.

4.1. Method of the evaluation

We asked 16 subjects (15 students and 1 teacher) to move freely, communicate about whatever they liked, and test gaze agreement in the following three situations:
Situation A: communication through a real frame.
Situation B: communication through the virtual window with a fixed view.
Situation C: communication through the virtual window with an adjusted view.

4.2. Results of the evaluation

4.2.1. Subjective evaluation of each situation. Figure 10 shows the result of the subjects rating each situation on five ranks with respect to gaze agreement.

Figure 10. Subjective evaluation of each situation.

The preference for situation A is clear, while the advantage of situation C over B is small. Nevertheless, the subjects' free descriptions of situation C were positive, such as "I could move so that I could see the partner better" and "I was moving naturally to better communicate with the partner." On the other hand, they pointed out several problems, such as "Improve the resolution of the image", "The camera in front of the screen is annoying" and "The view change was sudden or not smooth". We consider these problems rather technical, so we can fix them.

4.2.2. Effectiveness of view change by the participant's position. In the evaluation experiment, when a participant moved to explain something in his/her room, the other participant often moved to see it naturally. Most subjects made this view change without any explanation. Therefore, we consider that the view change feature by body movement can be effective for remote communication where people can move rather than sit in fixed places.

4.2.3. Difference by experience. There is a possibility that users with experience of fixed-viewpoint video conferencing could achieve gaze agreement even without the view change feature by body movement. Therefore, we examined the result according to the participants' experience of this sort of video conferencing. Among the subjects who ranked situation C as very good or good, the ratio of experienced subjects was almost the same as in the entire subject pool. Therefore, we consider that the proposed feature is accepted regardless of experience of video conferencing.

4.2.4. Opinions on purposes. To the question "For what purpose do you want to use it?", many answered daily communication tools such as "Video conferencing telephone", "Video chat" and "Interphone". Therefore, it is necessary to make the system smaller and handier.

5. Future Tasks

The following problems remain in this research.

5.1. Difference from the ideal implementation

Although reproducing the f2f situation as closely as possible is ideal, there are still some technical problems in the prototype implementation. The first problem concerns the fixed position of the camera. If a participant changes his/her position and posture, not just the view of the connected remote room but also the viewing angle should change. However, the camera position is fixed, so the angle is fixed and occlusion becomes different from reality. Figure 11 shows a typical example. A person or object just in front of the

camera is seen from the front rather than from other directions, even if a remote participant moves his/her position. People might tolerate this difference or adjust to it to some extent. We need to examine this incompatibility and evaluate it. If it is serious, we may use multiple cameras and produce a view from an arbitrary angle using CV and CG technologies. Otherwise, we may move a camera in front of the display, but then multiple requests may come to the camera from multiple remote participants.

Figure 11. Difference from reality due to the fixed position of the camera (view through a real window [Real] vs. the virtual window [This System]).

The second problem concerns the resolution and frame rate of the cameras. We are currently employing 640 x 480 resolution at 30 fps in order to achieve real-time image processing. Other problems are distortion of the image, the poor accuracy of position sensing by the commercially inexpensive Wii remote control, the delay in the feedback of the view change, and so on.

5.2. Employment of voice

We made the prototype considering only video and neglecting audio. Although we simply employ a single microphone at each site, adjusting the position of the voice according to the participant's position using multiple microphones is also possible and promising. Research on stereophony is active, so the combination with stereophony seems feasible and effective.

6. Conclusion

As a reason for the lack of awareness in the remote communication systems developed so far, we considered the missing feedback of the participant's position and posture into the view of the remote participants. Here, we regarded a display for remote communication as a real window and simulated communication through that window. We proposed a method of capturing the position and posture of each participant by an infrared sensor and reflecting it into the view of the others on their windows, as if they were talking through the windows. We made a prototype and a preliminary evaluation. Although several technical problems remain, participants were positive about the proposed method and enjoyed the proposed feature very naturally.

Acknowledgments

We thank the members of the electronic organ circle of our university for participating in the evaluation experiment. We thank Tatsuya Terada and Akihito Kitadai for various advices. This research is partially supported by a Grant-in-Aid for Scientific Research under contract number 16650015 and by the MEXT Research and Education Fund for Promoting Research on Symbiotic Information Technology.

References

[1] N. Negroponte, Being Digital, Vintage Books, 1995.
[2] K. Okada, F. Maeda, Y. Ichikawa, and Y. Matsushita, "Multiparty Videoconferencing at Virtual Social Distance: MAJIC Design", CSCW '94, 1994, pp. 385-393.
[3] O. Morikawa, J. Yamashita, Y. Fukui and S. Sato, "The relationship between arrangement of participants and comfortableness of communication in HyperMirror", Interaction 2001, 2001, pp. 179-186.
[4] R. Vertegaal, I. Weevers and C. Sohn, "An Attentive Video Conferencing System", CHI 2002, 2002, pp. 20-25.
[5] R. Vertegaal, "The GAZE groupware system: mediating joint attention in multiparty communication and collaboration", CHI '99, 1999, pp. 294-301.