Predicting Video-Conferencing Conversation Outcomes Based on Modeling Facial Expression Synchronization

Rui Li, Jared Curhan and Mohammed (Ehsan) Hoque
ROC HCI, Department of Computer Science, University of Rochester, New York, USA
Sloan College of Management, Massachusetts Institute of Technology, Massachusetts, USA

Abstract

Effective video-conferencing conversations are heavily influenced by each speaker's facial expression. In this study, we propose a novel probabilistic model to represent interactional synchrony of conversation partners' facial expressions in video-conferencing communication. In particular, we use a hidden Markov model (HMM) to capture temporal properties of each speaker's facial expression sequence. Based on the assumption of mutual influence between conversation partners, we couple their HMMs as two interacting processes. Furthermore, we summarize the multiple coupled HMMs with a stochastic process prior to discover a set of facial synchronization templates shared among the multiple conversation pairs. We validate the model by using the exhibition of these facial synchronization templates to predict the outcomes of video-conferencing conversations. The dataset includes 75 video-conferencing conversations from 150 Amazon Mechanical Turkers in the context of a new recruit negotiation. The results show that our proposed model achieves higher accuracy in predicting negotiation winners than a support vector machine and canonical HMMs. Further analysis indicates that some synchronized nonverbal templates contribute more to predicting the negotiation outcomes.

I. INTRODUCTION

Video-conferencing (VC) has become a popular platform for people to interact in professional and personal capacities [5]. However, some technical issues still exist, for example the limited view of the person, disengaged eye contact, and occasional interruptions resulting from network latency. These issues disrupt social presence, and thus lead to poor VC communication []. Nonverbal behavior plays a significant role in enhancing social presence. It provides a source of rich information about the speaker's intentions, goals, and values [][][4]. This motivates us to investigate facial expressions in VC communication in order to gain insight into effective communicative skills that will improve productivity and conversational satisfaction.
In this study, we investigate interactional synchrony of facial expressions in VC-mediated conversations, as shown in Figure 1. Interactional synchrony refers to patterned and aligned interactions occurring over time []. In a synchronic interaction, nonverbal behaviors (e.g., facial expressions, posture, gesture) of the individuals are coordinated to the rhythms and forms of verbal expressions. As a key indicator of interactional involvement, rapport, and mutuality, it has been used in deception detection, online learning, interpersonal trust evaluation, and a variety of other fields [][7][8][10]. However, the quantification of interactional synchrony is challenging, and it depends on the specific social context. We address this challenge by modeling facial expression synchronization of VC conversation partners given the social context of negotiation.

This work was partially supported by DARPA.

Fig. 1: An illustration of one VC conversation pair. The two participants communicate via our web-based VC platform in their own natural environments. The lower panels show the first six principal components of their facial expression action units (AUs) evolving over time.

We propose a novel probabilistic model to learn an effective representation of facial interactional synchrony. This representation contains a set of facial synchronization templates displayed by multiple conversation pairs, as shown in Fig. 2. In particular, we use a hidden Markov model (HMM) to describe the temporal properties of each speaker's facial expression. The Markovian property assumes, for instance, that if a speaker smiles at the previous time step, it is likely that he/she maintains the smile at the current time step. We further assume that there exists mutual influence between a pair of conversation partners. Namely, if a speaker's conversation partner displays a smile, it is likely that the speaker responds with a smile. To capture this mutual influence, we couple the partners' two HMMs together as interacting processes. We thus model the multiple conversation pairs with the corresponding multiple coupled HMMs. Furthermore, we summarize the multiple coupled HMMs by introducing a stochastic process as a prior. This prior allows us to uncover the shared facial synchronization templates among the multiple conversation pairs.
In this representation, a pair of conversation partners' facial expressions can be decomposed into instantiations of a particular subset of the globally shared synchronization templates. This novel representation of facial expression synchronization enables us not only to interpret effective VC communication skills but also to predict the outcomes of the conversations.

Fig. 2: Diagram of our approach illustrated on one conversation pair. From left to right, 8 facial expression action units (AUs) are extracted from the conversation partners' videos using the CERT toolbox [9]; the time series of the first 6 principal components are obtained from both partners' AUs with principal component analysis (PCA), and these six-dimensional PC time series are the input to our model; the model automatically decomposes the facial expression time series into salient segments (color coded) which correspond to a subset of globally shared facial synchronization templates displayed by this pair.

To conduct this study, we develop a VC system that works via a web browser without any additional download or plugin support. The platform is designed to enable automatic audio and video upload to a remote server every 0 seconds as two people engage in a video-conference. This functionality allows the framework to be deployed on Amazon Mechanical Turk, where remote human workers with access to a web browser and webcam communicate with each other. To the best of our knowledge, this is the first study to investigate VC-mediated facial expression synchrony. The contributions of our study include:

- We build a novel probabilistic model to learn an effective representation of facial interactional synchrony in VC communication. This novel representation decomposes multiple pairs of conversation partners' facial expression sequences into a set of globally shared synchronization templates.
- We further represent a conversation by the frequencies of occurrence of its facial synchronization templates, and achieve higher accuracy (78% on average) than a support vector machine (SVM) and canonical HMMs in predicting conversation outcomes.

II. METHOD OF COMPUTER-MEDIATED NEGOTIATION STUDY

We validate our model using the dataset collected from a study engaging Mechanical Turkers in a recruitment negotiation []. In this study, the conversational speech and facial expressions are recorded. The outcomes are the number of points earned by each participant and a post-negotiation questionnaire regarding the participants' evaluation of their counterparts and the negotiation process.

A. Participants

4 Mechanical Turkers participate in the study.
Participants are informed that their negotiations will be recorded and that the study's purpose is to investigate negotiation skills. The data collected from 150 of the Turkers is available for further analysis. Among them, 4 participants (%9) are female. The remaining Turkers either had damaged videos or lacked post-questionnaire data.

B. Apparatus

The negotiators interact with each other through a computer-mediated platform based on a browser-based VC system. The existing freely available video software (e.g., Skype, Google+ Hangouts) often requires users to download an application or install a plugin. In addition, Skype's current API and Google+ do not allow us to capture and record audio or video streams. To handle these hurdles, we develop our own browser-based VC system that is capable of capturing and analyzing the video stream in the cloud. We implement the functionality to transfer audio and video data every 0 seconds to prevent data loss and dynamically adapt to varying network latency.

C. Task

The task of this experiment is defined as a recruitment case. A recruitment case involves a scenario in which a candidate, who already has an offer, negotiates the compensation package with the recruiter. The candidates and the recruiters need to reach an agreement on eight issues related to salary, job assignment, location, vacation time, bonus, moving expense reimbursement, starting date, and health insurance. Each negotiation issue offers 5 possible options for resolution. Each option is associated with a specific number of points for each party. The goal of the negotiators is to maximize the total points they can possibly earn. For example, the 5 optional offers on the salary issue range from 65K to 45K; the candidate receives maximum points if he/she settles on a salary of 65K, whereas the recruiter loses maximum points, and vice versa.

D. Procedure

As Amazon Mechanical Turkers take the HIT, they are formed into 75 conversation pairs sequentially. The social roles are randomly assigned to the conversation partners. The participants coordinate with their partners to choose the locations and times for the VC-based negotiation, so they may interact in convenient and comfortable circumstances. After both participants provide consent, a button appears that leads each individual to the correct video chat room with signals that the two can speak with each other. The participants then proceed to play out the scenario outlined in their instructions. Recording begins when the two participants build the connection, and stops when one participant hangs up upon completion of the negotiation. Participants are free to offer information, arguments, and proposals, although they may not exchange their confidential instructions. The candidates won 47 conversations.

III. HIERARCHICAL COUPLED HIDDEN MARKOV MODEL

The graphical model is described at two levels. At the lower level, there are N coupled HMMs corresponding to N conversation pairs. At the top level, we use a beta process prior to discover the facial synchronization templates shared among these distinct yet related conversation pairs in the given social context, as shown in Fig. 3.

Fig. 3: Our proposed probabilistic graphical model. For each speaker's facial expression sequence, we use an HMM to describe its dynamic process during the conversation. We further couple the two HMMs of the conversation partners to describe their mutual influence, while allowing each of them to maintain his/her own dynamic process. At the top level, we use a beta process prior to summarize the facial synchronization templates shared across multiple conversation pairs. In this hierarchical structure, each conversation pair exhibits a particular subset of the globally shared facial synchronization templates. Shaded disks represent the observed facial expressions in the video frames.

A. Dynamic Likelihoods

We assume that the conversation partners' facial expressions are interdependent, and interact by influencing each other's emotional states or communicative strategies. Additionally, each speaker's facial expression sequence maintains its own internal dynamic. To encode these assumptions, we couple two Markov chains via a matrix of conditional probabilities between their hidden state variables.
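We denote the observations of the ith conversation pair's facial expression sequences as $O_i = \{y^{(1)}_{1:T_i}, y^{(2)}_{1:T_i}\}$, where $y^{(1)}_{1:T_i}$ are the observed facial expressions of the candidate, and $y^{(2)}_{1:T_i}$ are the recruiter's. The observations are the PCA components of the facial expression AUs extracted from the videos of a conversation pair. We further define $S_i = \{x^{(1)}_{1:T_i}, x^{(2)}_{1:T_i}\}$ as the hidden state sequences, where $x^{(1)}_{1:T_i}$ are the hidden states of the candidate, and $x^{(2)}_{1:T_i}$ represent the hidden states of the recruiter. These hidden states index some patterned facial expressions of both conversation partners. The state transition probabilities are defined as

$$x^{(1)}_{t+1} \mid x^{(1)}_{t}, x^{(2)}_{t} \sim \mathrm{Mult}\big(\pi^{(1)}_{x^{(1)}_{t},\, x^{(2)}_{t}}\big) \tag{1}$$

$$x^{(2)}_{t+1} \mid x^{(2)}_{t}, x^{(1)}_{t} \sim \mathrm{Mult}\big(\pi^{(2)}_{x^{(2)}_{t},\, x^{(1)}_{t}}\big) \tag{2}$$

The emission distributions are defined as normal distributions:

$$y^{(1)}_{t} \mid x^{(1)}_{t} \sim \mathcal{N}\big(\mu^{(1)}_{x^{(1)}_{t}},\, \Sigma^{(1)}_{x^{(1)}_{t}}\big) \tag{3}$$

$$y^{(2)}_{t} \mid x^{(2)}_{t} \sim \mathcal{N}\big(\mu^{(2)}_{x^{(2)}_{t}},\, \Sigma^{(2)}_{x^{(2)}_{t}}\big) \tag{4}$$

B. Combinatorial Prior

We propose to use a beta process prior to summarize the stereotypical and idiosyncratic facial synchronization exhibited by the multiple conversation pairs. This prior not only allows flexibility in the number of facial synchronization templates, but also enables each conversation pair to exhibit a subset of the globally shared templates.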
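The coupled dynamics of Section III-A can be illustrated with a short generative sketch. It is not the inference procedure used in this study and it omits the beta process prior; it only draws one candidate/recruiter pair of sequences from a coupled HMM in which each chain's next hidden state depends on both chains' current states and each observation is Gaussian given its hidden state. The function name `sample_coupled_hmm`, the array layout, the uniform initial states, and all parameter values below are assumptions made for illustration.

```python
import numpy as np


def sample_coupled_hmm(pi_c, pi_r, means_c, covs_c, means_r, covs_r, T, rng):
    """Draw hidden states and observations for one conversation pair.

    pi_c[j, k] is the candidate's next-state distribution given that the
    candidate is in state j and the recruiter is in state k (shape Kc x Kr x Kc).
    pi_r[k, j] is the recruiter's counterpart (shape Kr x Kc x Kr).
    """
    Kc, Kr = pi_c.shape[0], pi_r.shape[0]
    x_c = np.empty(T, dtype=int)
    x_r = np.empty(T, dtype=int)
    # Assumption: uniform initial states; the source does not specify them.
    x_c[0] = rng.integers(Kc)
    x_r[0] = rng.integers(Kr)
    for t in range(T - 1):
        # Each chain conditions on its own and its partner's current state.
        x_c[t + 1] = rng.choice(Kc, p=pi_c[x_c[t], x_r[t]])
        x_r[t + 1] = rng.choice(Kr, p=pi_r[x_r[t], x_c[t]])
    # Gaussian emissions indexed by each speaker's own hidden state.
    y_c = np.stack([rng.multivariate_normal(means_c[s], covs_c[s]) for s in x_c])
    y_r = np.stack([rng.multivariate_normal(means_r[s], covs_r[s]) for s in x_r])
    return x_c, x_r, y_c, y_r


# Toy setup (all values hypothetical): K states per speaker, D-dimensional
# observations standing in for the six principal components, T time steps.
K, D, T = 3, 6, 200
rng = np.random.default_rng(0)
pi_c = rng.dirichlet(np.ones(K), size=(K, K))  # each row sums to one
pi_r = rng.dirichlet(np.ones(K), size=(K, K))
means_c = rng.normal(size=(K, D))
means_r = rng.normal(size=(K, D))
covs_c = np.stack([0.1 * np.eye(D)] * K)
covs_r = np.stack([0.1 * np.eye(D)] * K)

x_c, x_r, y_c, y_r = sample_coupled_hmm(
    pi_c, pi_r, means_c, covs_c, means_r, covs_r, T, rng
)
print(x_c[:10], x_r[:10], y_c.shape, y_r.shape)
```

Indexing the transition tables by the pair of current states is one simple way to encode the coupling; a model with separate within-chain and cross-chain matrices would be an equally reasonable reading of the text.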
Fig. 5: Illustration of instantiations of facial synchronization templates learned by our model, with candidates on the left and recruiters on the right (panels: Template 1 to Template 9). Each column contains the instantiations of one particular template shared among the negotiation pairs, and each row represents one pair of negotiators. Our model allows the pairs to exhibit different subsets of the globally shared synchronization templates. For example, Template 1 represents a case in which both parties exhibit neutral facial expressions to each other. Template 4 is a case where candidates appear distracted, while recruiters respond with a smile. In Template 8, candidates exhibit subtle polite smiles while recruiters show neutral faces.

[...] a particular combination of two patterned facial expressions, as visualized in Fig. 4. The color-coded segments suggest that the negotiation pair periodically displays some facial synchronization templates. In Fig. 4, since the data clusters are visualized in the first principal component space, some separations may not be obvious. The transition probability matrix indicates that these synchronization templates are persistent, with high self-transition probabilities.

B. Facial Synchronization Templates

Fig. 5 shows a matrix of the shared facial synchronization templates estimated from the 75 negotiation pairs in our study. The exploratory interpretations of these templates are summarized in Table I. Each template can be quantified by the combination of the mean vector and the covariance matrix of one candidate's facial cluster and the mean vector and the covariance matrix of one recruiter's facial cluster, as illustrated in Fig. 4. Note that we only illustrate the subset of global synchronized templates learned from the dataset which are shared most frequently. Among the reported templates, since Templates 1, 4, 6, and 8 do not involve speaking, they are labeled as nonverbal templates.

TABLE I: Interpretation of Facial Synchronization Templates.

Template   | Candidate                                  | Recruiter
Template 1 | neutral face                               | neutral face
Template 2 | neutral face, listening                    | speaking, holding the turn
Template 3 | smile                                      | speaking, holding the turn
Template 4 | looking down, listening                    | smile
Template 5 | speaking, holding the turn                 | subtle smile, listening
Template 6 | big smile                                  | neutral face, listening, looking away
Template 7 | speaking, smiling, revealing information   | neutral face, listening
Template 8 | subtle polite smile                        | neutral face
Template 9 | looking down, speaking                     | big smile, listening

C. Negotiation Outcome Prediction

To measure the performance of our novel representation of the negotiation processes, we try to predict the negotiation outcomes based on the facial synchronization templates. We randomly assign the data into training and testing sets. In particular, each negotiation pair's negotiation process is represented by the frequencies of occurrence of its subset of templates, and the ground truth of a negotiation winner is determined by the points each party earned in the negotiation. We examine prediction performance given training sets containing various numbers of template instantiations for the 75 pairs' facial expression sequences. To implement the canonical HMMs, we have to assume that each negotiation pair exhibits the same set of templates. For the SVM, we use the corresponding segments of the facial expression time series for training. Figure 6 indicates that our model leads to significant improvement in prediction performance, particularly when fewer training instantiations are available. Canonical HMMs essentially compute a set of averaged templates from the 75 pairs of facial expression sequences. This representation blurs the distinction between the conversation pairs' facial expressions by treating them as variations. A major cause of the SVM's inferior performance is that it does not account for the temporal information of the sequential data. Our proposed model addresses these issues. In particular, the highest weights are assigned to Template 6 (nonverbal), 8 (nonverbal), and 9. This suggests that most predictive information is derived from these facial synchronization templates, most of which are nonverbal templates.

V. CONCLUSIONS

This paper investigates facial expression synchrony in a computer-mediated negotiation based on video-conferencing conversations. We further present a probabilistic dynamic model to automatically learn a set of facial synchronization templates. These templates are shared among negotiation pairs while they engage in a simulated negotiation task via a VC platform. The validation of these facial synchronization templates suggests that some purely nonverbal templates are strong indicators of the negotiation outcomes. This novel approach allows us to recognize negotiation skills and predict negotiation outcomes. For example, in a real-life scenario, professional negotiators may be trained to control their facial expressions to hide their feelings. Our approach can contribute to evaluating their performance and the effectiveness of the tactics.
The discovered facial synchronization templates can be embedded in an active learning scheme so as to evaluate VC communication skills and provide real-time feedback in computer-mediated communication. Our model can also be generalized to analyze other conversation scenarios such as interviews, customer service, and tele-medicine.

Fig. 6: ROC curve summarizing prediction performance for negotiation winners. Left: Area under average ROC curves for different numbers of template exemplars. Right: We compare our model with canonical HMM and SVM.

VI. ACKNOWLEDGMENTS

The authors acknowledge the help of Kazi Tasnif Islam, Anis Kallel and RuJie Zhao for the data collection.

REFERENCES

[1] N.E. Dunbar, M.L. Jensen, D.C. Tower and J.K. Burgoon, Synchronization of Nonverbal Behaviors in Detecting Mediated and Nonmediated Deception, Nonverbal Behav. J., vol. 8, 2014, pp 55-76.
[2] J.R. Curhan, R. Li and M.E. Hoque, Predicting Negotiation Outcomes from Smiles, in preparation, 2015.
[3] C.N. Gunawardena, Social Presence Theory and Implications for Interaction and Collaborative Learning in Computer Conferences, Educational Telecommunications Inter. J., vol., 1995, pp 47-66.
[4] J.B. Walther, Computer-Mediated Communication Impersonal, Interpersonal, and Hyperpersonal Interaction, Communication Research J., vol., 1996, pp -4.
[5] J. Caukin, 5 Million People Concurrently Online on Skype, Retrieved Dec., 0 from Skype.
[6] Paul Ekman and W. V. Friesen, Facial Action Coding System: A Technique for the Measurement of Facial Movement, Consulting Psychologists Press, 1978.
[7] M. Gratier, Expressive Timing and Interactional Synchrony between Mothers and Infants: Cultural Similarities, Cultural Differences, and the Immigration Experience, Cog. Dev. J., vol. 8, 2004, pp 5-554.
[8] J. Cassell, Embodied Conversational Interface Agents, Comm. of the ACM J., vol. 4, 2000, pp 70-78.
[9] X. Yu, S. Zhang, Y. Yu and N. Dunbar, The Computer Expression Recognition Toolbox (CERT), in Seventh IEEE International Conference on Automatic Face and Gesture Recognition, Santa Barbara, CA, 0, pp. 98-05.
[10] G. Littlewort, J. Whitehill, T. Wu, I. Fasel, M. Frank, J. Movellan and M. Bartlett, Automated Analysis of Interactional Synchrony using Robust Facial Tracking and Expression Recognition, in Tenth IEEE International Conference on Automatic Face and Gesture Recognition, Shanghai, China, 2014, pp. -6.
[11] M. Muhlenbrock and U. Hoppe, Computer Supported Interaction Analysis of Group Problem Solving, in Third International Conference on Computer Support for Collaborative Learning, Palo Alto, CA, 1999, pp. 50.
[12] R. Thibaux and M.I. Jordan, Hierarchical Beta Processes and the Indian Buffet Process, in Tenth International Conference on Artificial Intelligence and Statistics, San Juan, Puerto Rico, 2007, pp. 564 57.