LightGuide: Projected Visualizations for Hand Movement Guidance

Rajinder Sodhi 1,2, Hrvoje Benko 1, Andrew D. Wilson 1
1 Microsoft Research, One Microsoft Way, Redmond, WA 98052, {benko, awilson}@microsoft.com
2 Department of Computer Science, University of Illinois, 201 North Goodwin Avenue, Urbana, IL 61801, rsodhi2@illinois.edu

Figure 1. An overview of the range of 3D cues we created to help guide a user's movement. In (a), a user is shown a 2D arrow with a circle that moves in the horizontal plane, (b) shows a 3D arrow, (c) a 3D path where blue indicates the movement trajectory and (d) uses positive and negative spatial coloring with an arrow on the user's hand to indicate depth.

ABSTRACT
LightGuide is a system that explores a new approach to gesture guidance where we project guidance hints directly on a user's body. These projected hints guide the user in completing the desired motion with their body part, which is particularly useful for performing movements that require accuracy and proper technique, such as during exercise or physical therapy. Our proof-of-concept implementation consists of a single low-cost depth camera and projector, and we present four novel interaction techniques that are focused on guiding a user's hand in mid-air. Our visualizations are designed to incorporate both feedback and feedforward cues to help guide users through a range of movements. We quantify the performance of LightGuide in a user study comparing each of our on-body visualizations to hand animation videos on a computer display in both time and accuracy. Exceeding our expectations, participants performed movements with an average error of 21.6mm, nearly 85% more accurately than when guided by video.

Author Keywords
On-demand interfaces, on-body computing, appropriated surfaces, tracking.

ACM Classification Keywords
H.5.2 [Information interfaces and presentation]: User Interfaces. Input devices & strategies.

INTRODUCTION
When performing gestures that are intricate or that require a great deal of technique, physical feedback from an instructor can often be useful for performing a movement. For example, when someone wants to perform the proper technique for a weight training exercise, an instructor often gives instantaneous feedback by gradually correcting the position of the user's body through physical touch. While this exchange seems crucial, the availability of such a resource disappears when a user is no longer in the presence of an instructor. Instead, directing human movement is usually accomplished through video recordings, diagrams, animations, or textual descriptions. We rely on a bevy of online resources that include detailed graphical imagery or do-it-yourself videos (see Figure 2). However, without incremental and real-time feedback, interpreting and following a set of movements can still be a challenge. In this paper, we explore an alternative approach to movement guidance where body movement can be directed using projected visual hints. Our system, LightGuide, provides users with real-time incremental feedback for movement guidance that is projected directly on their hand (see Figure 1).
LightGuide provides a unique benefit over existing gesture guidance methods: users can focus their attention directly on the body part rather than divide their attention between a video screen and the movement. Users can move their body parts freely in space, releasing the user from always being oriented towards a video screen. All our system requires is a projector and a depth-sensing camera.
While our system does not require a user to be physically instrumented with a device, these hints can also be used with body-worn devices [12,13] or with limited-screen-space handheld devices, such as smartphones. Thus, our work provides three primary contributions: First, we introduce a series of unique visualizations for movement guidance that incorporate feedback and feedforward cues. Second, we contribute a prototype system, LightGuide, which is comprised of a single overhead projector and depth-sensing camera to sense the user and their movements. Our proof-of-concept system facilitates the display of our visual hints on a user's body and allows us to replay pre-recorded or system-generated 3D paths at a user-driven pace or dynamically controllable speeds. Finally, we show results of our quantitative comparative evaluation and qualitative user feedback, and discuss the pros and cons of our approach.

MOTIVATION
We can envision a number of practical applications that leverage on-body projected hints for guidance. For example, imagine an amateur athlete working on punching exercises during martial arts training. With projected hints, the system can direct the user towards the optimal reach of the arm to ensure that the shoulder is not overextended to cause injury. In another example, physical therapy patients recovering from an injury can be guided through practicing exercises at home. Novice musicians learning to play an instrument can be directed to the correct posture when their form begins to drift. We believe that all of these movements can be guided with correct spatially registered projections on a user's body.

RELATED WORK
Our on-body projection approach draws from a variety of fields, including computer-aided instruction, augmented reality, and projection-based guidance. Here we focus on the relevant related work from these areas to position and understand the contributions of the present work.

Computer-Aided Task Guidance
Receiving task guidance through computer-aided instruction has been a research focus for decades. Demonstrations through animated videos have shown that computer-based instruction can improve task performance, particularly in assembly-based tasks [31]. Palmiter and Elkerton showed that well-placed textual hints with animated videos can also give immediate benefits for task performance [23]. Others have explored adding graphical visual hints to video in post-production to help users explore a range of dance movements [29]. While these hints are statically placed in a video, previous literature has also looked at using co-located real-time feedback and feedforward mechanisms to provide on-demand assistance to guide users through gestural tasks [3,11]. Such systems lead users in situ, while a user is in the process of performing the gesture. Our work draws upon this prior research where we explore how co-located projection-based on-body hints can show similar improvements for movement tasks.

Figure 2. Examples of how people currently follow instruction for movement (e.g. Kinect virtual avatar, stretching, rhythmic dance notation [4]).

Task Guidance in Augmented Reality
The field of Augmented Reality (AR) has shown a number of methods to provide guidance by using head-mounted displays or mobile devices to convey instructions that are superimposed on virtual or real-world video feeds [1,8]. Feiner et al. explored using AR to guide users through repairing laser printers. More recently [21,22], AR has been demonstrated for a variety of tasks, such as playing the guitar or manufacturing. In tangible AR, White et al. explored using a variety of graphical representations through ghosting to enable the discovery and completion of gestures [32].
While these approaches are promising, head-mounted displays can be cumbersome for users to wear and diminutive screens can constrain the user experience.

Augmenting Environments with Projectors
Recent advancements in projection technology have made it possible to imbue users' environments with projection capabilities [24,26,27,33]. For example, Wilson and Benko explored using a series of depth-sensing cameras and projectors to transform a room-sized environment to enable un-instrumented surfaces (e.g., a desk) to emulate interactive displays [33]. In addition, the emergence of miniaturized projection technology has opened up the possibility of appropriating the user's body as a display surface, where even a user's hand alone contains more surface area than a typical smartphone [12,13]. In addition to body-worn projection systems, handheld [5,6] and head-mounted projectors [15] also allow users to be mobile, without requiring their environments to be permanently instrumented with projectors and cameras [18,20,25]. All of these projection-based approaches are similar to our approach of using a depth-sensing camera for tracking and a projector to turn an arbitrary surface into an interactive display.

Projection-Based Guidance
The use of projection-based augmented reality (AR) for guiding users through tasks has been a research vision in recent years [10]. Kirk et al. looked at using projection in remote collaborative scenarios (e.g., remote Lego building) with real-time projection guidance co-located next to the user's hand on a static desk [19]. Similarly, Rosenthal et al. found that combining static text and pictorial instructions on screen with micro-projection-based guidance on physical objects improved overall task performance [28].
In contrast to prior literature, we explore projecting visual hints with real-time feedback directly on the user's hand, which is tracked in mid-air, for movement guidance.

DESIGN CONSIDERATIONS FOR GUIDING MOVEMENT
To provide in-situ guidance for the user's movement, visual hints need to convey a sense of where to move next. We are motivated by the idea that one can co-locate the instruction for the movement with the body part that needs to be moved along a desired path. To inform the design of such hints and the validity of the overall approach, we focused this work on projected hints on the user's hand as it moves freely in space. We believe that our approach allows the user to focus their attention on a body part and the movement itself. Through our initial exploration as well as leveraging prior literature, we highlight six critical aspects that need to be considered when designing on-body guidance hints: (1) feedback, (2) feedforward, (3) scale, (4) dimension, (5) perspective and (6) timing.

Feedback
Feedback components provide information about the current state of the user during the execution of the movement. This feedback can come in the form of a user's current position, the path they took (e.g., [3]), or their error or deviation from the path, to name a few. For example, with position, the feedback can either be relayed to the user in a relative sense (e.g. a user's projected progress along a movement path) or in an absolute sense (e.g. a user's absolute deviation from a movement path).

Feedforward
Feedforward components provide information to the user about the movement's shape prior to executing the movement. As described in [3,11], the feedforward can come in the form of showing the user where to go next, a segment of the movement path ahead, or simply showing the user the entire movement path. One possible downside of showing the whole movement is that, for sufficiently complex paths, path self-occlusions may obstruct a user's view of where to move next.

Scale
To gain insight into how to convey scale, we consider Stevens' Power Law, which describes a relationship between the magnitude of a stimulus (e.g., visual length, visual area, visual color) and its perceived intensity or strength (projected line, projected circle, projected intensity) [2]. That is, the relationship allows us to understand how users perceive visual cues, e.g., the area of a circle, color, or the length of a line, and describes how well they convey the scale of a movement (e.g., what is the distance I should move my hand to get from point A to point B when a projected line denotes distance, versus using area or color to denote distance?).
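For reference, Stevens' Power Law can be written as below. The exponent values quoted in the comments are commonly cited textbook figures, not measurements from this paper:

```latex
% Stevens' Power Law: perceived magnitude \psi as a function of physical
% stimulus magnitude \phi, with unit constant k and a modality-dependent
% exponent a.
\psi(\phi) = k\,\phi^{a}
% Commonly cited exponents: visual length a \approx 1.0 (near-veridical),
% visual area a \approx 0.7 -- one reading of why a projected line can map
% distance more faithfully than a projected circle's area.
```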
Dimension
As found in [17], the way in which the user perceives the structure of the task greatly affects their performance for high-dimensional input. As such, how we convey where to move in three dimensions depends on how intuitive the user finds the visual hint. For certain users, the most intuitive way to get from point A to point B may be in the form of a visual hint that is broken down into two distinct components, e.g. where to move horizontally and vertically. In contrast, for others, a single-metaphor hint may be the most perceptually intuitive, e.g. go from point A to B all in one simultaneous task.

Perspective
One aspect of conveying an on-body visual hint is to explore egocentric and exocentric viewpoints [30] (i.e., first-person and third-person perspective, respectively). With an egocentric viewpoint, we want users to get a greater sense of presence where the hints become a natural extension of their bodies, reinforcing guidance by tugging the user's hand along the movement path. In contrast, with an exocentric viewpoint, rather than seeing guided hints embodied in the user, they are seen at an overview (e.g., video).

Timing
In our design of an on-body visual hint, we feel that there are two main approaches that may effectively communicate timing in motion: system-imposed timing and self-guidance. For system-imposed timing, users follow a visual hint that is displayed at a system-specified speed. A visual hint can convey a range of dynamics, such as keeping the speed constant or changing it dynamically throughout the movement. For self-guidance, the user can see a visual hint and choose the pace at which they react to the hint.

LIGHTGUIDE PROJECTED HINTS
We describe a set of visual hints that follow important aspects of the design space we have highlighted. Our visual hints can be used to help guide a user's movement in all three translational dimensions. To our knowledge, this is the first implementation of on-body projected hints for real-time movement guidance. While this is a rather large design space with many possible solutions, our iterative design process included an analysis of 1D, 2D and 3D visual hints and offers a set of compelling solutions that can inform future designs. We focus our descriptions on the final hint designs which resulted from our iterative process, but encourage the reader to see the accompanying video for a more complete reference of alternatives. In this initial exploration we have chosen to focus and verify our ideas by tackling hand translation first (i.e., movements in the x, y and z dimensions), without any rotations of the hand. As such, we leave a visual vocabulary for 3D rotations to future work.

Follow Spot
The Follow Spot can be seen in Figure 3(a)-(b).
Figure 3. In (a)-(b) the Follow Spot shows a user a white circle and a black arrow that reduces in size as the user moves their hand up; (c)-(d) the Hue Cue shows positive coloring (blue), which represents the direction a user should follow horizontally while moving away from the negative coloring (red).

Through our initial pilots, we found the most intuitive metaphor for users was to use 1D visual length (e.g. distance), which is reflected in the mapping specified by Stevens' Law [2]. To specify feedback in depth, the 1D arrow points away from the user to signal moving up and towards the user to signal moving down. The size of the arrow dictates the distance to the target depth position, communicating the scale of the movement. That is, as the user moves up in the z-direction to hit a target depth as specified by a large black arrow pointing away from the user, the tip of the arrow decreases in size until it becomes a black horizontal line. The visual hint otherwise contains no feedforward mechanism.

Hue Cue
We create a visual hint that utilizes negative and positive spatial coloring to indicate direction and the space a user should occupy, shown in Figure 3(c)-(d). The cue uses a combination of spatial coloring in x and y and depth feedback in z to guide a user's movement in three dimensions. The feedforward component is conveyed in the positive coloring, shown in blue, and the negative coloring for feedback in red. To perform the whole movement, a user can continuously move toward the blue and away from the red. In order for a user to see if they are moving at the correct depth, a Follow Spot hint is projected in the middle of the hand.

3D Arrow
We create a more direct mapping to visualize direction by conveying a simple 3D Arrow to the user, shown in Figure 4(a)-(b). The benefit of using a 3D Arrow is that direction for all three dimensions, x, y and z, can be conveyed in a single metaphor. Additionally, to engage the user's egocentric viewpoint, we render the 3D Arrow from the user's perspective and add shading to emphasize its 3D shape.

3D Pathlet
We create a 3D Pathlet metaphor where users are shown a small segment of the path ahead in the movement. This visual hint allows users to see a segment of the path, denoted in blue in Figure 4(c)-(d), as a form of feedforward. The red dot provides users with their relative position, projected on the movement path. The benefit of the 3D Pathlet is that users can see changes in direction of curved motions along the path well before they execute the movement. Figure 4(c)-(d) shows a user completing a movement shaped in the form of the alphabet letter 'N' displayed at a 45-degree angle. Additionally, for perspective, a similar shadow is used to emphasize the 3D Pathlet's shape. As shown in Figure 5, when the user distorts their hand significantly, the 3D illusion is diminished.

Figure 4. In (a)-(b), the 3D Arrow is shown pointing down and up; (c)-(d) the 3D Pathlet shows the user (red dot) a small segment of what is ahead in the path (denoted in blue).

Movement Guidance Algorithm
LightGuide can replay any pre-recorded movement (e.g., recorded with a depth sensor) or ideal generated path (e.g., a parametric wave). For a path, we summarize our algorithm (Figure 6) as follows: The path is first pre-processed into segments, where a segment is composed of two points in the order with which we wish to guide the user. The path is then translated to the user's current hand position where the visual hint is rendered. As the user follows a visual hint, any deviation from the path can result in an absolute, relaxed-absolute or relative projection (Figure 6(a)-(c)). The user continues through the path using one of these three approaches until the path is complete.
The absolute projection results in a visual hint that immediately guides the user back to the movement path once deviated, the relaxed-absolute projection slowly guides the user back to the movement path, and the relative projection simply shows the user the next direction of the movement without requiring the user to be directly on the path. Each projection type is task dependent. For example, a dancing movement may be less stringent about following the exact path and could thus use a relative projection. In contrast, an exercise movement where a user can potentially strain a muscle if done incorrectly may use an absolute or relaxed-absolute projection. Based on our initial pilots, we chose to have the Follow Spot use an absolute mapping, the Hue Cue use a relative mapping in x and y and an absolute mapping in z, and the 3D Arrow and 3D Pathlet use a relative mapping; a sketch of these three projection modes follows below.

Figure 5. In (a), the 3D Pathlet creates an illusion of the path extending beyond your body, where the shadow emphasizes the 3D nature of the hint. In (b), the illusion is diminished when the user's hand is oriented at an extreme angle.
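The following minimal sketch illustrates the three projection modes on a polyline path. It is our reconstruction from the description above, not the authors' code; names such as `guidance_target`, the easing factor `alpha` and the step size are our own placeholders:

```python
import numpy as np

def closest_point_on_path(path, p):
    """Closest point to p on a polyline (sequence of 3D points), with segment index."""
    best_q, best_d, best_i = path[0], np.inf, 0
    for i in range(len(path) - 1):
        a, b = path[i], path[i + 1]
        ab = b - a
        t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
        q = a + t * ab                       # projection of p onto segment i
        d = np.linalg.norm(p - q)
        if d < best_d:
            best_q, best_d, best_i = q, d, i
    return best_q, best_i

def guidance_target(path, hand, mode, alpha=0.2, step=20.0):
    """Position (mm) the visual hint should steer the hand toward next."""
    on_path, i = closest_point_on_path(path, hand)
    if mode == "absolute":
        # Immediately pull the user back to the closest point on the path.
        return on_path
    if mode == "relaxed_absolute":
        # Ease the user back toward the path over several updates.
        return hand + alpha * (on_path - hand)
    if mode == "relative":
        # Only convey the next direction of travel; deviation is tolerated.
        seg = path[i + 1] - path[i]
        return hand + step * seg / np.linalg.norm(seg)
    raise ValueError(mode)
```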
Figure 6. Our algorithm first breaks down the path into smaller segments. The path is translated to the user's current hand position and the visual hint is rendered to begin guiding the user. When the user deviates from the desired path, the visual hint can (a) direct the user back to the closest point on the path, (b) incrementally bring the user back to the path, or (c) guide the user through a relative movement.

Dynamics
For system-imposed timing, LightGuide can replay the visual hint so that it follows the movement path automatically in space at any speed. To ensure that the visual hints do not move off of the user's hand, we followed the same procedure as [12], in which we compute a derivative map of the depth image to check for large changes in the boundaries at the contours of the hand. That is, if the visual hint reaches the contour of the hand, it stops moving until the user has adequately caught up to the path. For the self-guidance approach, the system relies on the user to direct themselves through the movement. A visual hint describes the motion trajectory through feedback and feedforward cues and a user can choose their own pace.

LIGHTGUIDE IMPLEMENTATION
Our proof-of-concept LightGuide system, seen in Figure 7, consists of two primary components. First is a commercially available Microsoft Kinect depth camera, which provides 640x480-pixel depth images at 30Hz. The second component is a standard off-the-shelf InFocus IN1503 wide-angle projector (1280x1024 pixels) [16]. The depth camera and projector are both rigidly mounted to a metal stand positioned above the user. This ensures that we could adequately see the user's hand motions as well as ensure that our projected visual hints would fully cover the user's hands. The visual hints are rendered from a fixed perspective that assumes a user is looking down at a 45-degree angle towards their hand. While occlusion (particularly self-occlusion) is a fundamental problem with all projector-camera systems, we do not feel that this played a significant role in users' interactions. In the future, multiple projectors and cameras can be used to help reduce the effects of occlusions on more complex unconstrained movements.

Figure 7. (a) LightGuide uses a single projector and depth-sensing camera; (b) the projector and depth camera are fixed above the user's body.

Projector-Camera Calibration
For the visual hints to be correctly projected on a user's hand, we must first unify the projector and camera into the same coordinate space. We calibrate our projector to the depth camera, as the camera already reports real-world coordinates (mm). The intrinsic parameters of the projector can be modeled using the diagonal field of view and the center of projection. To compute the extrinsic parameters, we require four non-coplanar correspondences between points that can be seen in the depth camera and projector. Once we establish correspondences between the 2D points of the projector and the 3D points of the camera, we use the POSIT algorithm [7] to find the position and orientation of the projector; a sketch follows below.
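A minimal sketch of this calibration, treating the projector as an inverse camera. The paper uses POSIT [7] for the extrinsics; the sketch substitutes OpenCV's solvePnP, which solves the same 2D-3D pose problem. The 60-degree field of view and the correspondence values are illustrative assumptions (six points are listed so the default solver has enough non-coplanar data):

```python
import numpy as np
import cv2

# Correspondences between 3D points in the depth camera's world frame (mm)
# and the 2D projector pixels that illuminate them. Values are placeholders.
world_pts = np.array([[-250, -180, 950], [240, -170, 1010], [230, 210, 900],
                      [-240, 200, 1100], [10, 20, 1250], [-60, -40, 820]],
                     dtype=np.float64)
proj_pts = np.array([[160, 820], [1080, 790], [1050, 170],
                     [210, 230], [640, 500], [420, 660]], dtype=np.float64)

# Projector intrinsics modeled from an assumed 60-degree diagonal field of
# view and a principal point at the center of its 1280x1024 output.
w, h = 1280, 1024
f = np.hypot(w, h) / (2.0 * np.tan(np.radians(60.0) / 2.0))
K = np.array([[f, 0, w / 2.0],
              [0, f, h / 2.0],
              [0, 0, 1.0]])

# Extrinsics: projector pose with respect to the depth camera.
ok, rvec, tvec = cv2.solvePnP(world_pts, proj_pts, K, None)
R, _ = cv2.Rodrigues(rvec)

def world_to_projector(p):
    """Map a 3D depth-camera point (mm) to a projector pixel."""
    u = K @ (R @ p + tvec.ravel())
    return u[:2] / u[2]
```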
Hand Tracking
The prototype system first transforms every pixel in the input image into world coordinates and then crops pixels outside of a volume of 1 cubic meter. This removes the floor, walls and other objects in the scene (e.g. a desk). The prototype then identifies the user's arms by determining continuous regions along the depth image. The system then finds the farthest point along the entire arm by tracing through the continuous region, eventually reaching the most distant point along the hand. To extract the user's hand, we assume a constant hand length [14], which worked well in our tests. A distance transform [9] is then used on the resulting image and the maximum is assumed to be the center position of the hand; a sketch of this pipeline follows below.
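A condensed sketch of this pipeline under stated assumptions: the interaction-volume bounds, the 180mm hand length, and the border-based arm entry point are our placeholders, and the "farthest point" step is approximated rather than traced along the region as in the paper:

```python
import numpy as np
from scipy.ndimage import label, distance_transform_edt

def track_hand(points, valid):
    """points: HxWx3 per-pixel world coordinates (mm); valid: HxW depth mask."""
    # 1. Crop to roughly one cubic meter of interaction volume (assumed bounds).
    lo = np.array([-500.0, -500.0, 500.0])
    hi = np.array([500.0, 500.0, 1500.0])
    in_vol = valid & np.all((points >= lo) & (points <= hi), axis=2)

    # 2. Connected regions in the cropped mask; take the largest as the arm.
    regions, n = label(in_vol)
    if n == 0:
        return None
    sizes = np.bincount(regions.ravel())[1:]
    arm = regions == (np.argmax(sizes) + 1)

    # 3. Approximate the fingertip as the arm point farthest (in 3D) from
    #    where the region enters the image border.
    ys, xs = np.nonzero(arm)
    border = arm.copy()
    border[1:-1, 1:-1] = False
    by, bx = np.nonzero(border)
    entry = points[by[0], bx[0]] if by.size else points[ys[0], xs[0]]
    tip_idx = np.argmax(np.linalg.norm(points[ys, xs] - entry, axis=1))
    tip = points[ys[tip_idx], xs[tip_idx]]

    # 4. Keep pixels within an assumed constant hand length of the tip [14];
    #    the maximum of a distance transform gives the hand center.
    HAND_LENGTH = 180.0  # mm, placeholder
    near_tip = np.linalg.norm(points[ys, xs] - tip, axis=1) < HAND_LENGTH
    hand = np.zeros_like(arm)
    hand[ys[near_tip], xs[near_tip]] = True
    dt = distance_transform_edt(hand)
    cy, cx = np.unravel_index(np.argmax(dt), dt.shape)
    return points[cy, cx]  # 3D hand-center estimate
```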
USER STUDY
The purpose of this study was to demonstrate the feasibility of our approach and to determine if our prototype is capable of guiding a user's hand in mid-air. Specifically, we wanted to know how accurately users follow on-body projected visualizations. We also wanted to investigate how the accuracy and behavior of a user changes for paths at varying depth levels. In addition to following, we also explored the accuracy and speed of self-guided movements where users dictate their own pace of movement. To place LightGuide's performance in context, we compared our method to video, as we felt it was representative of a resource that users currently utilize. The video condition, shown in Figure 8, is comprised of a 3D model of a hand that follows an ideal, system-generated path.

Figure 8. A rendering of the 3D hand that is used in our video condition. The motion is an arc that moves towards the user and gradually increases in depth.

Although our animated video does not provide nearly as much visual context to participants as a real-life video, a system-controllable video allowed us to remove the effects of any human or tracking error that could affect the movement paths. More importantly, the animated video allowed us to control the perspective of the video (e.g. rendered from the user's perspective) as well as precisely control the speed and timing of replayed movements. While we feel that the best performance with our system can be attained by using both video and on-body hints, our comparison independently measures the effect of our visual hints and video for movement guidance.

Participants
We recruited 10 right-handed participants from our local metropolitan area (2 female), ranging in age from 18 to 40. All participants were screened prior to the study to ensure their range of motion was adequate to perform our tasks. The study took approximately 90 minutes and participants received a gratuity for their time.

Test Movements
Our goal was to support interactions on a variety of movements. For our user study, we included five different paths: a line which must be traced back and forth, a square, a circle, an 'N', and a line plus curve (Figure 9). These paths share similar characteristics to the types of movements patients are asked to perform in physical therapy sessions (see Motivation). The paths, seen in Figure 9, range in length from 300 to 630mm (mean = 438.1mm, SD = 103.6mm). To ensure that we adequately tested a variety of depth levels, we vary the paths at three different angles: 0°, 45° and 90° with respect to the horizontal plane in the participant's frame of reference.

Procedure
During the experiment, participants were instructed to stand at a comfortable position underneath the overhead projector and depth-sensing camera. Prior to starting, we verified that each participant had enough room to move their hand while being adequately tracked by the system.

Figure 9. In (a) the test paths used in our study; (b) each path is oriented at 0°, 45° and 90° (only the circle path is shown).

The primary task consisted of a participant moving their hand in space following specific hand guidance visual hints. By following, we mean that a visual hint would begin moving in space at a speed of 30 mm/sec and participants would follow the hint and respond to its cues. Our choice of 30 mm/sec for the visualization speed was made through informal pilot studies that had users try out a variety of speeds; 30 mm/sec was chosen to be the most comfortable constant speed while still producing reasonable hand motions. To quantify how users perform a movement at their own pace, a secondary task was included where the same 3D Arrow was used without any system-imposed timing. That is, the 3D Arrow would only change position if the user responded to the direction indicated by the 3D Arrow. We refer to this as self-guided. We performed a within-subjects experiment and in total we tested 6 visual hints: Follow Spot, 3D Follow-Arrow, 3D Self-Guided Arrow, 3D Pathlet, Video on Hand, and Video on Screen. From here on, we refer to our two 3D Arrow conditions as 3D F-Arrow and 3D SG-Arrow. All except the Video on Screen condition were projected on the participant's hand. Our baseline Video on Screen condition was shown to a participant on a computer monitor situated directly in front of the user.
Importantly, participants were told to keep their hands flat (facing down) during the entire experiment to ensure that the visual hints would consistently appear on their hands between trials, as well as to ensure consistent hand-tracking performance by our system. To provide a consistent start location for each movement, we marked the desired starting hand location with markers on the floor in front of the participant and asked them to return to the marker before beginning each new trial. In each trial, participants were instructed to hold out their hand and follow the guidance cues, completing a single path as accurately as possible. We asked the participant to keep the visual hint at the center of their hand. Once the path was completed, the system would sound a chime and a red circle would appear on the participant's hand, signaling the user to return to the start position. In total, participants were asked to follow a single visualization over our 15 test paths; presentation order was randomized.
Figure 10. Overall distribution of unscaled deviations from path, by condition (Follow Spot, 3D F-Arrow, 3D SG-Arrow, 3D Pathlet, Video Hand, Video Screen; error in mm). The circles denote users while colors show the 15 unique paths.

The procedure was repeated for each of our conditions. Before each measurement phase, participants were allowed to practice using the visual hints to move through a path. Each condition lasted approximately 10 minutes, of which 5 minutes was used for practice and 5 minutes for measurement. Between conditions, we allocated 5 minutes for participants to rest in order to reduce the effects of hand fatigue. Each session produced 90 trials (6 conditions x 5 paths x 3 angles) per participant. To counterbalance the conditions, the presentation of each condition was randomized to remove the effects of ordering. Users were interviewed after each session, followed by a short post-study interview. We recorded video of the participants and measured their position, hand orientation and time.

RESULTS
We separate our analysis into two components: Movement Accuracy and Movement Times. Our 10 participants produced a total of 900 movement trials on 15 unique paths. During the study, we experienced only a single type of outlier relating to the tracking of a user's hand. The tracking results would change depending on whether the user would self-occlude their hand (e.g., rotate towards the principal axis of the camera). Additionally, we experienced 21 trials (2%) where users would lean their bodies into the capture volume, leading to momentary erroneous hand measurements that would only appear in the outer extents of the capture volume. The erroneous measurements in the outer extents were filtered in post-data analysis, allowing us to use all trial measurements in our final analysis.

Movement Accuracy
We take a two-fold approach to measuring the accuracy of movements: deviation from the path and fit (e.g., see [14]). In both cases, to determine accuracy, we use the absolute Euclidean distance from the closest point on the path as an error metric; a minimal sketch of this metric follows below. As in prior literature [12], we highlight two sources of systematic error: 1) non-linearity and improper calibration of the projector and camera (e.g., the location of the projected visualization differs from where the camera expects it to be), and 2) inaccuracy in the hand tracking, especially when the user's hand begins to leave the capture volume. Overall, we found a small global systematic offset between the camera and projector, where the average X-offset across users was 9.2mm to the left of a path and the Y-offset was 1.5mm below a path, which is in agreement with findings in previous literature [12,14]. We did not apply these global X/Y offsets, as participants would compensate for the system inaccuracy in the following conditions by moving their hand until the visualization appeared at the center of their hand. In the self-guided condition, the location of the 3D SG-Arrow was sufficiently well placed in all our trials so that participants could see the visual hint.
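A minimal sketch of this error metric, computing each hand sample's absolute Euclidean distance to the closest point on the model polyline (our reconstruction; the function names are hypothetical):

```python
import numpy as np

def point_to_path(p, path):
    """Distance (mm) from point p to the closest point on polyline path (Mx3)."""
    a, b = path[:-1], path[1:]                  # segment endpoints
    ab = b - a
    t = np.clip(np.einsum('ij,ij->i', p - a, ab) /
                np.einsum('ij,ij->i', ab, ab), 0.0, 1.0)
    q = a + t[:, None] * ab                     # per-segment closest points
    return np.linalg.norm(p - q, axis=1).min()

def path_deviation(samples, path):
    """Average unscaled deviation of hand samples (Nx3) from the model path."""
    return float(np.mean([point_to_path(p, path) for p in samples]))
```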
Movement Deviations
We analyzed the average deviations of users across all paths and visualizations using their raw, unscaled distances to the closest point on the path (see the full distribution in Figure 10 and a single user's performance in Figure 11). Using a standard ANOVA, we found that there was a significant difference between our visual hints (F[5, 894] = 276.5, p < .001). Post-hoc Bonferroni-corrected t-tests showed that the Follow Spot and 3D F-Arrow performed significantly better than both video conditions, with average deviations of 24.6mm (SD = 9.0mm) and 49.9mm (SD = 29.17mm) respectively (t16 = 25.6, p < .001; t26 = 122.5, p < .001). Additionally, the distribution highlights the difficulties users had in perceiving scale in our animated videos. Surprisingly, a Bonferroni-corrected t-test comparing the accuracy of our video conditions shows that significantly smaller deviations can be achieved by showing an identically rendered video on the user's hand (t56 = 93.0, p < .001).

Movement Shape
Although the unscaled distribution in Figure 10 shows that our users were not able to achieve the desired scaling on a path with the video screen condition, the results do not explain how well users do at performing the shape of the movement. To help analyze shape, we use the Iterative Closest Point (ICP) algorithm to register the user's movements to our model paths [34]. With ICP, we have the flexibility of rotating, translating and scaling an object in all three axes to find the best match. For our purposes, we exclude rotation from our ICP transformation, as our paths' unique characteristics are defined by their angle of rotation. That is, we wanted to see how well users perceived angles in video, and excluding rotation allowed us to analyze deviations from angled motions; a minimal sketch of this registration follows below.
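A minimal sketch of this registration (our reconstruction, not the authors' implementation): a closest-point correspondence step alternated with a closed-form scale-and-translation fit, with rotation deliberately left out. For brevity, correspondences use the closest model vertex rather than the closest point on a segment:

```python
import numpy as np

def register_scale_translation(user, model, iters=20):
    """Align user samples (Nx3) to a model path (Mx3) using only uniform
    scale and translation, so errors in perceived angle stay in the residual."""
    s, t = 1.0, np.zeros(3)
    for _ in range(iters):
        moved = s * user + t
        # Correspondence step: closest model vertex for every user sample.
        d = np.linalg.norm(moved[:, None, :] - model[None, :, :], axis=2)
        target = model[d.argmin(axis=1)]
        # Closed-form least-squares scale and translation (no rotation).
        uc, yc = user.mean(0), target.mean(0)
        u0, y0 = user - uc, target - yc
        s = np.sum(u0 * y0) / np.sum(u0 * u0)
        t = yc - s * uc
    return s * user + t, s, t
```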
Figure 11. A single user's performance on paths oriented at 45° using the Video Screen (top row) and Follow Spot (bottom row) visual hints. The ground truth is denoted in black and the user's movement is shown in red. Axis units are in mm.

Figure 12 shows results on the change in deviation when a user's path is scaled and translated with ICP. On average, participants using the video screen condition deviated from the desired path by 25.1mm (mean SD = 7.3mm), while the video hand condition fared comparably. Participants using the Follow Spot condition showed significantly less deviation, at 13.7mm (mean SD = 6.6mm) (t16 = 11.4, p < .001). Additionally, our results indicate that there was a significant performance difference across orientations of the paths in the video screen condition (F[2, 147] = 24.6, p < .001). On average, participants performed angled movements with an average deviation of 43.2mm (SD = 9.3mm), approximately 40% less accurately than flat or vertical movements.

Figure 12. The Iterative Closest Point algorithm is used to analyze the performance of a user's shape. A user's movement is translated and scaled iteratively until their motion converges to the ideal path. Error bars encode standard error of the mean.

Movement Times
We break down movement measurements into two components: self-guided times for the 3D SG-Arrow compared to the video conditions, and distances ahead of or behind a path for each of our visual hints.

Figure 13. (a) Participants' movement times were analyzed in the 3D SG-Arrow condition and compared to video on hand and video on screen; (b) shows participants' average distance behind each projected visualization.

Self-Guided Times
The average movement times across all users and paths for the 3D SG-Arrow, video screen, and video hand conditions are visualized in Figure 13(a). With the 3D SG-Arrow, although participants were able to perform the movements with more accuracy than in both video conditions, movement times for video were significantly faster (F[2, 447] = 54.9, p < .001). On average, participants performed video screen movements with a mean of 3.45s (SD = 1.67s), nearly twice as fast as the 3D SG-Arrow. These results reflect our observation that participants' tendencies were to first watch the whole path conveyed in the video, where users acquire the gist of the entire movement. In contrast, users with the 3D SG-Arrow would perform movements in situ, figuring out direction as they moved along the path.

Distance Ahead/Behind Paths
Figure 13(b) displays the average distance (mm) participants were in front of, or behind, each of the visualizations in the following conditions. To illustrate how participants follow a 3D F-Arrow, Figure 14 displays a single participant's movement on a circle that is oriented at 45 degrees with respect to the canonical horizontal X-Y plane.
Figure 14. The plots show the same movement from two perspectives of how far in front of or behind a user's hand is compared to the projected visual hint. Green denotes the participant's path, the red line shows the actual position of the hint and the blue line shows the projected point on the path.

User Feedback
In the video condition, users were able to quickly perform movements, but often expressed frustration with the lack of feedback. As one participant described for video, "It was harder to reproduce subtle movements than to follow. It was also harder to judge elevation based on the size of the hand." Importantly with video, users also described the lack of feedforward hints. As one participant said, "With video, you have global features. You just never know what's coming next."

With the Follow Spot visualization, users commented on the general ease of understanding the visualization. For example, as a participant explained, "The circle one was simplest, it was only telling you up or down. Less displayed info made it easier." Similarly, another participant noted, "For me, the best visualization was probably the circle with the arrow, as once I was used to the mechanics of it, it became somewhat second nature."

With the 3D Pathlet, users commented on the benefits of knowing what was coming up ahead in the movement. As a participant described, "The feedback was great and I liked seeing where I was going." Although occlusions were not prevalent in all paths, users occasionally commented on a disappearing red ball. As a result, participants would tend to overshoot a path, as they were unable to see how much of the path they had consumed.

Following and Leading
A majority of the users in our interviews (8/10) said they preferred the 3D SG-Arrow over all other visualizations. The ability for users to shape their own tempo was important to their overall satisfaction with the visualization. As a participant noted, "Creating my own tempo made it easier to concentrate on where I was moving." Another participant described, "If I go faster, I feel like I can do it better. Moving at my own speed lets me concentrate on what the system wants me to do. Because it's reacting to me, I can focus on the shape of the path. I didn't have to follow a slower system when I could do better."

DISCUSSION
The results of our study support our approach of guiding users' movements with on-body projections, in that users were able to perform more accurately with our system than with video. Our reasoning for using an animated video was to adequately control for perspective, tracking error and speed. However, we hypothesize that the reason for the large difference in scale was that we only provided users with a single shadow on a white horizontal plane for visual context. Thus, a more representative measure of accuracy with our hints can be seen in our analysis with ICP, where the Follow Spot visual hint did significantly better than all other conditions. In general, users' qualitative feedback also reflected our empirical findings, with general comments positively reflecting the ease of understanding the hint. However, this accuracy comes at a cost. With our 3D SG-Arrow, users were able to accurately guide themselves through a path, but were unable to do so at the same speed as video. One reason for this behavior may be attributed to users being able to see the movement before completing the path, getting the gist of the motion. Although we have highlighted scenarios where users are no longer in a position to look at a video screen (e.g., when they are using body-worn projection systems), a more beneficial scenario for on-body projections may be attained when combined with video. One strategy users can take is to view the gist of the motion and see the visual context in video, and then use our on-body hints to perfect the motions.
Our findings also showed that when the exact same video was moved from the screen to the hand, there was a significant performance difference in scale. While this may be attributed to the lack of visual context, or a change in scale (e.g., the video on the hand was smaller), another possibility could be that users were able to more accurately calibrate the desired movement of their hand when the video was rendered in the same location as the physical body part we were attempting to guide. Surprisingly, at times users would become so immersed in the visual hints that it became unclear to them whether they were moving their hand or whether it was the visualization that was moving in space. This reaction reflects similar findings in previous literature [27], where projected light was used to trick viewers into thinking a static car was moving along a road. Among the visual hints, the 3D F-Arrow showed the most promising behavior with regard to users consistently keeping pace with system-imposed timing. That is, by simply conveying a sense of the next point along the path through direction alone, users were able to more accurately predict where to move next. Our user opinions suggest that when the task is not rhythmic in nature or does not require a fixed speed or accuracy, the most benefit may be obtained by allowing the user to dictate their own pace. This gives users the flexibility to decide how they want to interpret and react to the visual hints.

CONCLUSION AND FUTURE WORK
In this paper, we described and evaluated four on-body projected visual hints to help guide a user's movement in mid-air. In addition, we introduced LightGuide, a proof-of-concept system that uses an overhead projector and camera to display our visual hints, which can replay movements at a user-guided or system-imposed speed.
Our results suggest that users can follow our on-body hints accurately for movement of a single body part in space and can do so at a system-controlled speed. While our chief goal with the present work was to demonstrate that on-body projected hints could be used for movement guidance, we have only tested these visual hints on a user's hand. These visual hints could just as easily be shown on the rest of the arms, torso and legs, assuming the rest of the user's body is tracked (e.g. with the Kinect skeletal tracker). Furthermore, we have yet to explore guidance of two or more body parts (e.g. two hands) simultaneously. Another important question moving forward is how to adapt or add rotational visual hints to allow guiding a user's full range of motion. Finally, there are many fascinating cognitive questions we would like to investigate. For example, does projecting the same visual hint on a screen allow the user to perform the movement just as accurately as projecting the hint on the body? Can on-body visual hints be used to distort the user's sense of their space, allowing us to control their range of motion? While we have helped to define what the design space of projected hints might look like by exploring guidance first, we have yet to answer the question of how well users learn a particular movement. In addition, we have yet to explore how well users perform the dynamics of a movement, particularly when users are guided through alternating speeds. Our work has allowed us to answer the fundamental question of whether or not it is possible, but many interesting questions lie ahead.

ACKNOWLEDGMENTS
We thank the Microsoft Research VIBE and ASI groups, as well as the 2011 Microsoft Research interns, for helping us improve LightGuide.

REFERENCES
1. Baird, K.M. and Barfield, W. Evaluating the effectiveness of augmented reality displays for a manual assembly task. Virtual Reality 4, 4, 1999, 250-259.
2. Banaji, M.R. Stevens, Stanley Smith (1906-73). Journal of Personality and Social Psychology, 1994.
3. Bau, O. and Mackay, W.E. OctoPocus: a dynamic guide for learning gesture-based command sets. Proc. UIST, 2008, 37-46.
4. Britannica. http://britannica.com/EBchecked/topic/15794/dance-notation.
5. Cao, X. and Balakrishnan, R. Interacting with dynamically defined information spaces using a handheld projector and a pen. Proc. UIST, 2006, 225-234.
6. Cao, X., Forlines, C., and Balakrishnan, R. Multi-user interaction using handheld projectors. Proc. UIST, 2007, 43-52.
7. DeMenthon, D. and Davis, L.S. Model-based object pose in 25 lines of code. IJCV 15, 1 (1995), 123-141.
8. Feiner, S., MacIntyre, B., and Seligmann, D. Knowledge-based augmented reality. Commun. ACM 36, 7, 1993, 53-62.
9. Felzenszwalb, P.F. and Huttenlocher, D.P. Distance transforms of sampled functions. 2004.
10. Flagg, M. and Rehg, J.M. Projector-guided painting. Proc. UIST, 2006, 235-244.
11. Freeman, D., Benko, H., Morris, M.R., and Wigdor, D. ShadowGuides: Visualizations for in-situ learning of multi-touch and whole-hand gestures. Proc. ITS, 2009, 165-172.
12. Harrison, C., Benko, H., and Wilson, A.D. OmniTouch: Wearable multitouch interaction everywhere. Proc. UIST, 2011, 441-450.
13. Harrison, C., Tan, D., and Morris, D. Skinput: Appropriating the body as an input surface. Proc. CHI, 2010, 453-462.
14. Holz, C. and Wilson, A. Data miming: inferring spatial object descriptions from human gesture. Proc. CHI, 2011, 811-820.
15. Hua, H., Brown, L.D., and Gao, C. SCAPE: Supporting stereoscopic collaboration in projective environments. Proc. CG, 2004, 66-75.
16. InFocus. http://www.infocus.com.
17. Jacob, R. The perceptual structure of multidimensional input device selection. Proc. CHI, 1992, 211-218.
18. Kane, S.K., Avrahami, D., Wobbrock, J.O., and Harrison, B. Bonfire: A nomadic system for hybrid laptop-tabletop interaction. Proc. UIST, 2009, 129-138.
19. Kirk, D. and Stanton Fraser, D. Comparing remote gesture technologies for supporting collaborative physical tasks. Proc. CHI, 2006, 1191-1200.
20. Mistry, P. and Maes, P. WUW - Wear Ur World: a wearable gestural interface. Proc. CHI, 2009, 4111-4116.
21. Motokawa, Y. and Saito, H. Support system for guitar playing using augmented reality display. Proc. ISMAR, 2006, 243-244.
22. Neumann, U. and Majoros, A. Cognitive, performance, and systems issues for augmented reality applications in manufacturing and maintenance. Proc. of VR, 1998, 4-11.
23. Palmiter, S. and Elkerton, J. An evaluation of animated demonstrations for learning computer-based tasks. Proc. CHI, 1991, 257-263.
24. Pinhanez, C. The Everywhere Displays projector: A device to create ubiquitous graphical interfaces. Proc. UbiComp, 2001, 315-331.
25. Raskar, R., Beardsley, P., van Baar, J., et al. RFIG lamps: interacting with a self-describing world via photosensing wireless tags and projectors. Proc. SIGGRAPH, 2004, 406-415.
26. Raskar, R., Welch, G., Cutts, M., Lake, A., Stesin, L., and Fuchs, H. The office of the future: A unified approach to image-based modeling and spatially immersive displays. Proc. SIGGRAPH, 1998, 179-188.
27. Raskar, R., Welch, G., Low, K.L., and Bandyopadhyay, D. Shader lamps: Animating real objects with image-based illumination. Proc. Eurographics, 2001, 89.
28. Rosenthal, S., Kane, S.K., Wobbrock, J.O., and Avrahami, D. Augmenting on-screen instructions with micro-projected guides: When it works, and when it fails. Proc. UbiComp, 2010, 203-212.
29. SynchronousObjects. http://synchronousobjects.osu.edu.
30. Tan, D.S., Pausch, R., and Hodgins, J. Exploiting the cognitive and social benefits of physically large displays. 2004.
31. Watson, G., Curran, R., Butterfield, J., and Craig, C. The effect of using animated work instructions over text and static graphics when performing a small scale engineering assembly. Proc. CE, 2008, 541-550.
32. White, S., Lister, L., and Feiner, S. Visual hints for tangible gestures in augmented reality. Proc. ISMAR, 2007, 1-4.
33. Wilson, A.D. and Benko, H. Combining multiple depth cameras and projectors for interactions on, above and between surfaces. Proc. UIST, 2010, 273-282.
34. Zhang, Z. Iterative point matching for registration of free-form curves and surfaces. International Journal of Computer Vision 13, 2 (1994), 119-152.