ISSN 0103-9741. Monografias em Ciência da Computação n 17/11. Automatic Video Editing for Video-Based Interactive Storytelling



Similar documents
Average and Instantaneous Rates of Change: The Derivative

OPTIMAL FLEET SELECTION FOR EARTHMOVING OPERATIONS

Verifying Numerical Convergence Rates

Geometric Stratification of Accounting Data

Comparison between two approaches to overload control in a Real Server: local or hybrid solutions?

Research on the Anti-perspective Correction Algorithm of QR Barcode

How To Ensure That An Eac Edge Program Is Successful

Lecture 10: What is a Function, definition, piecewise defined functions, difference quotient, domain of a function

Instantaneous Rate of Change:

Chapter 11. Limits and an Introduction to Calculus. Selected Applications

Math Test Sections. The College Board: Expanding College Opportunity

Guide to Cover Letters & Thank You Letters

- 1 - Handout #22 May 23, 2012 Huffman Encoding and Data Compression. CS106B Spring Handout by Julie Zelenski with minor edits by Keith Schwarz

An inquiry into the multiplier process in IS-LM model

ACT Math Facts & Formulas

Schedulability Analysis under Graph Routing in WirelessHART Networks

The EOQ Inventory Formula

Catalogue no XIE. Survey Methodology. December 2004

2.23 Gambling Rehabilitation Services. Introduction

The Dynamics of Movie Purchase and Rental Decisions: Customer Relationship Implications to Movie Studios

A system to monitor the quality of automated coding of textual answers to open questions

Working Capital 2013 UK plc s unproductive 69 billion

1.6. Analyse Optimum Volume and Surface Area. Maximum Volume for a Given Surface Area. Example 1. Solution

An Orientation to the Public Health System for Participants and Spectators

2 Limits and Derivatives

SAT Subject Math Level 1 Facts & Formulas

Operation go-live! Mastering the people side of operational readiness

Note nine: Linear programming CSE Linear constraints and objective functions. 1.1 Introductory example. Copyright c Sanjoy Dasgupta 1

SAMPLE DESIGN FOR THE TERRORISM RISK INSURANCE PROGRAM SURVEY

EC201 Intermediate Macroeconomics. EC201 Intermediate Macroeconomics Problem set 8 Solution


SWITCH T F T F SELECT. (b) local schedule of two branches. (a) if-then-else construct A & B MUX. one iteration cycle

The modelling of business rules for dashboard reporting using mutual information

SHAPE: A NEW BUSINESS ANALYTICS WEB PLATFORM FOR GETTING INSIGHTS ON ELECTRICAL LOAD PATTERNS

Design and Analysis of a Fault-Tolerant Mechanism for a Server-Less Video-On-Demand System

What is Advanced Corporate Finance? What is finance? What is Corporate Finance? Deciding how to optimally manage a firm s assets and liabilities.

Volumes of Pyramids and Cones. Use the Pythagorean Theorem to find the value of the variable. h 2 m. 1.5 m 12 in. 8 in. 2.5 m

2.12 Student Transportation. Introduction

NAFN NEWS SPRING2011 ISSUE 7. Welcome to the Spring edition of the NAFN Newsletter! INDEX. Service Updates Follow That Car! Turn Back The Clock

MATHEMATICS FOR ENGINEERING DIFFERENTIATION TUTORIAL 1 - BASIC DIFFERENTIATION

A strong credit score can help you score a lower rate on a mortgage

FINITE DIFFERENCE METHODS

Theoretical calculation of the heat capacity

Broadband Digital Direct Down Conversion Receiver Suitable for Software Defined Radio

Digital evolution Where next for the consumer facing business?

Notes: Most of the material in this chapter is taken from Young and Freedman, Chap. 12.

Unemployment insurance/severance payments and informality in developing countries

Tangent Lines and Rates of Change

Staying in-between Music Technology in Higher Education

Can a Lump-Sum Transfer Make Everyone Enjoy the Gains. from Free Trade?

Math 113 HW #5 Solutions

Optimized Data Indexing Algorithms for OLAP Systems

New Vocabulary volume

Derivatives Math 120 Calculus I D Joyce, Fall 2013

His solution? Federal law that requires government agencies and private industry to encrypt, or digitally scramble, sensitive data.

Writing Mathematics Papers

College Planning Using Cash Value Life Insurance

Determine the perimeter of a triangle using algebra Find the area of a triangle using the formula

Chapter 10: Refrigeration Cycles

M(0) = 1 M(1) = 2 M(h) = M(h 1) + M(h 2) + 1 (h > 1)

Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises

1. Case description. Best practice description

Strategic trading in a dynamic noisy market. Dimitri Vayanos

h Understanding the safe operating principles and h Gaining maximum benefit and efficiency from your h Evaluating your testing system's performance

Tis Problem and Retail Inventory Management

Predicting the behavior of interacting humans by fusing data from multiple sources

Free Shipping and Repeat Buying on the Internet: Theory and Evidence

Area-Specific Recreation Use Estimation Using the National Visitor Use Monitoring Program Data

f(a + h) f(a) f (a) = lim

Haptic Manipulation of Virtual Materials for Medical Application

Yale ICF Working Paper No May 2005


In other words the graph of the polynomial should pass through the points

a joint initiative of Cost of Production Calculator

Pre-trial Settlement with Imperfect Private Monitoring

Human Capital, Asset Allocation, and Life Insurance

Research on Risk Assessment of PFI Projects Based on Grid-fuzzy Borda Number

Large-scale Virtual Acoustics Simulation at Audio Rates Using Three Dimensional Finite Difference Time Domain and Multiple GPUs

Staffing and routing in a two-tier call centre. Sameer Hasija*, Edieal J. Pinker and Robert A. Shumsky

Torchmark Corporation 2001 Third Avenue South Birmingham, Alabama Contact: Joyce Lane NYSE Symbol: TMK

Industrial Robot: An International Journal Emerald Article: High-level robot programming based on CAD: dealing with unpredictable environments

Training Robust Support Vector Regression via D. C. Program

The Demand for Food Away From Home Full-Service or Fast Food?

How doctors can close the gap

THE NEISS SAMPLE (DESIGN AND IMPLEMENTATION) 1997 to Present. Prepared for public release by:

Projective Geometry. Projective Geometry

Using Intelligent Agents to Discover Energy Saving Opportunities within Data Centers

Dynamically Scalable Architectures for E-Commerce

For Sale By Owner Program. We can help with our for sale by owner kit that includes:

Optimizing Desktop Virtualization Solutions with the Cisco UCS Storage Accelerator

Message, Audience, Production (MAP) Framework for Teaching Media Literacy Social Studies Integration PRODUCTION

ANALYTICAL REPORT ON THE 2010 URBAN EMPLOYMENT UNEMPLOYMENT SURVEY

A survey of Quality Engineering-Management journals by bibliometric indicators

EUROSYSTEM. Working Paper

Simultaneous Location of Trauma Centers and Helicopters for Emergency Medical Service Planning

Orchestrating Bulk Data Transfers across Geo-Distributed Datacenters

Distances in random graphs with infinite mean degrees

MULTY BINARY TURBO CODED WOFDM PERFORMANCE IN FLAT RAYLEIGH FADING CHANNELS

CHAPTER 7. Di erentiation

Pioneer Fund Story. Searching for Value Today and Tomorrow. Pioneer Funds Equities

Transcription:

PUC ISSN 0103-9741 Monografias em Ciência da Computação n 17/11 Automatic Video Editing for Video-Based Interactive Stortelling Edirlei Soares de Lima Bruno Feijó Antonio L. Furtado Angelo Ernani Maia Ciarlini Cesar Tadeu Pozzer Departamento de Informática PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO RUA MARQUÊS DE SÃO VICENTE, 225 - CEP 22453-900 RIO DE JANEIRO - BRASIL

Monografias em Ciência da Computação, No. 17/11 ISSN: 0103-9741 Editor: Prof. Carlos José Pereira de Lucena December, 2011 Automatic Video Editing for Video-Based Interactive Stortelling Edirlei Soares de Lima, Bruno Feijó, Antonio L. Furtado, Angelo Ernani Maia Ciarlini (UNIRIO), Cesar Tadeu Pozzer (UFSM) {elima, bfeijo, furtado}@inf.puc-rio.br, angelo.ciarlini@uniriotec.br, pozzer@inf.ufsm.br Abstract. Te development of interactive stortelling sstems wit te qualit of feature films is a ard callenge. A promising approac to tis problem is te use of recorded videos to dramatize te stories. However, in tis approac, automatic video editing is a critical stage. In tis paper, we present an effective metod based on cinematograp principles tat automaticall edit segments of videos in real-time, wile te plot is being generated b an interactive stortelling sstem. Kewords: Video editing, interactive stortelling, virtual cinematograp. Resumo. O desenvolvimento de sistemas stortelling interativo com a qualidade de filmes de longa-metragem é um grande desafio. Uma abordagem promissora para este problema é a utilização de vídeos pré-gravados para dramatizar as istórias. No entanto, nesta abordagem, a edição de vídeo automática é uma fase crítica. Neste artigo, apresentamos um método baseado em princípios de cinematografia que é capaz de editar automaticamente segmentos de vídeos em tempo real, enquanto o roteiro está sendo gerado por um sistema de stortelling interativo. Palavras-cave: Edição de vídeo, stortelling interativo, cinematografia virtual.

In carge of publications Rosane Teles Lins Castilo Assessoria de Biblioteca, Documentação e Informação PUC-Rio Departamento de Informática Rua Marquês de São Vicente, 225 - Gávea 22453-900 Rio de Janeiro RJ Brasil Tel. +55 21 3114-1516 Fa: +55 21 3114-1530 E-mail: bib-di@inf.puc-rio.br ii

1 Introduction Since te advent of projected motion pictures in te late 19t centur, cinema as been promoting new forms of immersive eperience from te silent black-and-wite film to te ig-definition stereoscopic 3D projection. Eac one of tese forms as been focusing on te imaging and auditor side of te eperience. Following a different track, films wit interactive plots ave been recentl proposed as a new eperience. Te area of interactive stortelling ma provide te proper foundation for tis new art form. However tis researc area as few works oriented to motion pictures. Interactive stortelling sstems tr to andle te unpredictabilit of events b using 3D virtual worlds to dramatize te stories [1][2][3]. Tis approac provides te freedom to move te caracters and cameras to an place in te virtual world, allowing te sstem to sow te stor events from an angle or perspective. Furtermore, 3D caracters ma ave teir properties easil canged, suc as sape, personalit, emotional state, and intents. However, tis freedom usuall sacrifices te visual qualit of te real-time narrative. Te visual result of an current interactive stortelling sstem tat uses 3D worlds to dramatize te stories is far from te ecellent qualit of a computer animated feature film. Te development of interactive stortelling sstems wit te qualit of feature films is a ard callenge, even considering te new tecnolog used b te videogame industr. A promising approac to tis problem is te replacement of virtual 3D caracters b video sequences in wic real actors perform predetermined actions. Tis approac, we call video-based interactive stortelling in tis paper, as sowed some interesting results in te last ears [4][5][6]. However, most of tose results are domain specific applications or do not ave te intention of presenting a general approac to andle te problems of a generic video-based interactive stor. We claim tat te task of developing a general video-based interactive stortelling sstem sould firstl solve two basic problems: real-time video compositing and realtime video editing. In tis paper we focus on te second problem onl. We present a metod based on cinematograp principles to automaticall edit segments of videos in real-time, wile te plot is being generated b an interactive stortelling sstem. Tis paper is organized as follows. Section 2 presents previous works. Section 3 presents te arcitecture of our stortelling sstem. Section 4 presents our metod to automaticall edit te segments of video. In section 5, we analze te performance and accurac of te results to demonstrate te efficienc of our approac. Finall, in section 6, we present concluding remarks. 2 Related Works Man works ave alread been done wit te objective of appling videos as a form of visual representation of interactive narratives. Bocconi et al. [7] present Vo Populi, a sstem tat generates video documentaries based on verbal annotations in te audio cannel of te video segments. In a similar work, Cua and Ruan [8] designed a sstem to support te process of video information management, i.e.: segmenting, logging, retrieving, and sequencing of video data. Tis later sstem semi-automaticall detects and annotates sots for furter retrieval based on a specified time constraint. Aanger and Little [9] present an automated sstem to compose and deliver segments 1

of news videos. In a similar approac, but not focusing on news, Nack and Parkes [10] present a metod to establis continuit between segments of videos using rules based on te content of te segments. Te idea of a generic framework for te production of interactive narratives is eplored b Urso et al. [6]. Te autors present te SapeSifting Media, a sstem designed for te production and deliver of interactive screen-media narratives. Te productions are mainl made wit pre-recorded video segments. Te variations are acieved b te automatic selection and rearrangement of atomic elements of content into individual narrations. Te sstem does not incorporate an mecanism for te automatic generation of stories. Essentiall, teir approac is to empower te umancentered autoring of interactive narratives rater tan attempting to build sstems tat generate narratives temselves. A similar approac is used b Sen et al. [11], in wic te autors present a video editing sstem tat elps users to compose sequences of scenes to tell stories b selecting video segments from a corpus of annotated clips. Anoter researc problem closel related to automatic video editing is video summarization, wic refers to te process of creating a summar of a digital video. Tis summar, wic must contain onl ig priorit entities and events from te video, sould eibit reasonable degrees of continuit and sould be free of repetition. A classical approac to video summarization is presented b Ma et al. [12]. In a recent researc work, Porteous et al. [5] present a video-based stortelling sstem tat generates multiple stor variants from a baseline video. Te video content is generated b an adaptation of video summarization tecniques tat decompose te baseline video into sequences of interconnected sots saring a common semantic tread. Te video sequences are associated wit stor events and alternative storlines are generated b te use of AI planning tecniques. Te main problem of tis work is te lack of cinematograp tecniques necessar to andle continuit issues, especiall because teir sstem uses video segments etracted from linear films. Wen tese video segments are jointed in different orders, several continuit failures ma occur. Te AI planning algoritm ensures te stor continuit, but not te visual continuit of te film. In tis paper we present a metod for automatic video editing in video-based interactive stortelling applications. In contrast wit te previous works, we present a metod based on cinematograp teor to automaticall select appropriated transitions, avoid jump cuts, and create smoot and continuous interactive video sequences. Te oter autors focus mainl in te creation of stories b ordering video segments, witout concerning about te wa ow tese segments will be connected. 3 Sstem Arcitecture Te present work is part of te Logtell Project [13]. Logtell is an interactive stortelling sstem based on temporal modal logic, nonlinear planning, and nondeterministic events, wic conciliates a plot-based approac [14] wit a caracter-based approac [1]. For eac class of caracter, tere are goal-inference rules tat provide objectives to be acieved b te caracters wen certain situations are observed. For eample, witin a fair tale genre, if a victim V is killed b a villain L, a ero will want to avenge te victim: 2

(victim(v) villain(l) kill(l,v) alive(l)) were and are te temporal modal operators alwas olds and eventuall olds respectivel; and and are te logical smbols of negation and conjunction respectivel. Tis goal-inference rule does not eplicitl refer to eroes, because te overall specification of te fair tale genre will effectivel eclude an oter caracter wo is not a ero. In Logtell, a plot is an ordered sequence of events obtained b a non-linear planning algoritm and an event e i is eecuted b a nondeterministic automaton (NDA) composed b actions a i (Figure 1). = e i 1 2 4 a i 3 5 NDA Eac event e i is eecuted b a Nondeterministic Automaton Figure 1. Plot, events e i, and nondeterministic automata wit actions a i. Double circles mean final states. 6 Te audience can interact wit te stor b suggesting events to te planning algoritm and/or interfering in te nondeterministic automata in a directl or indirectl wa. Canges in te environment or in te parameters of a caracter during te eecution of a nondeterministic automaton are eamples of indirect interferences. An eample of direct interference would be a precise coice from a state tat as bifurcations (e.g. state 2 in Figure 1). Details on plot generation and NDA specification in X can be found in [13]. In te present paper, we are concerned wit te proposal of a new video-based dramatization arcitecture tat can be integrated into sstems like X. Te proposed dramatization arcitecture (Figure 2) is composed b tree cinematograp-inspired agents: Director, Scene Composer, and Editor. Te Director is responsible for interpreting te stor events (planner output), retrieving videos from te remote Video Database, and also for controlling te dramatization of te stor events. Te Scene Composer, using real-time compositing tecniques (suc as croma ke) is responsible for combining te visual elements (video sources) tat compose te scenes, creating te illusion tat all te scene elements are parts of te same video. Te Editor agent, using cinematograp knowledge of video editing, is responsible for keeping te temporal and spatial continuit of te film. Tis later agent accomplises its mission b automaticall selecting te appropriated transitions between video segments, avoiding jump-cuts and presenting te stor events troug a smoot and continuous video sequence. In te Video Database, eac action is associated to a coverage and a master scene video (a structure tat is detailed in te net section). 3

Figure 2. Te proposed video-based dramatization arcitecture. Tis paper presents te Cinematograp Editor agent onl. Te proposed metod can be used b an interactive stortelling sstem tat uses video segments to dramatize interactive stories. 4 Te Cinematograp Editor Agent A film can create its own time and space to fit an particular stortelling situation. Time ma be compressed or epanded; speeded or slowed; remain in te present, go forward or backward. Space ma be sortened or stretced; moved nearer or farter; presented in true or false perspective; or be completel remade in to a setting tat ma eist onl on film [15]. Independentl of all spatial and temporal awkwardness a filmmaker ma purposel create in a film, it is alwas important to keep te spectator s mind oriented in time and space. Cinematograp defines a set of rules and principles tat, wen used correctl, ensure te basic spatial and temporal continuit of a film. Some of tese rules can be applied wen te film is being sot and oters during te editing process. Continuit editing is a set of editing practices tat establis spatial and temporal continuit between sots and keep te narrative moving forward logicall and smootl, witout disruptions in space or time [16]. Continuit editing is used to join sots togeter to create dramatic meanings. Wit an effective editor, te audience will not notice ow sots of various frame sizes and angles are spliced togeter to tell te stor. Te best editing is usuall te unobtrusive editing [15], tat is, te one in wic te audience does not notice tat te editor joined te sots. Our approac to create a computer program tat is able to automaticall edit video segments consists in translating cinematograp principles and practices directl into logical rules. Te main problem of using videos to dramatize an interactive narrative is te lack of freedom occasioned b immutable prerecorded segments of videos, especiall during an automatic editing pase. For eample, suppose te following conditions are detected: tere are two video segments tat cannot be joined togeter witout sacrificing film continuit; bot scenes are important to te stor; and te order of te 4

segments cannot be cange. In tis case, te onl option tat eists is to break te film continuit. Our solution to solve tis problem is te use of te master scene filming metod. In tis metod, we produce te master scene (a sot tat includes te wole setting) along wit te coverage sots (sots tat reveal different aspects of te action). In tis wa, te editor as alwas two sots of eac scene. If te agent detects a problem in te coverage sots, a new sot of te same action can be etracted from te master scene (Figure 3). Furtermore, tis metod gives to te editor te freedom to cut creativel and alter te pacing, te empasis, and even te point of view of te scenes. Filming a scene wit te master scene metod can be done using a single camera or multiple cameras [17]. Te master scene metod is used in probabl 95% of narrative films sots toda [16]. Figure 3. Te master scene structure. Te basic purpose of te Editor agent is to join te selected video segments and keep te film continuit. A cinematograp principle used b conventional editors to join and maintain te continuit between segments of videos is te use of adequate scene transitions. Tere are four basic was to transit from one sot to anoter [18]: Cut: Consists of an instantaneous cange from one sot to te net. It is most often used were te action is continuous and wen tere is no cange in time or location. Dissolve: Consists of a gradual cange from te ending of one sot into te beginning of te net sot. Te dissolve is correctl used wen tere is a cange in time or location, te time needs to be slowed down or speed up, and wen tere is a visual relationsip between te outgoing and te incoming images. Wipe: Consists of a line or sape tat moves across te screen removing te image of te sot just ending wile simultaneousl revealing te net sot beind te line or sape. Te wipe is correctl used were tere is a cange in te location and wen tere is no strong visual relationsip between te outgoing and te incoming frames. Fade: Consists of a gradual cange from a full visible image into a solid black screen (fade-out) and a gradual cange from a solid black screen into a full visible image (fade-in). Te fade is used at te beginning/end of a film, scene, or sequence. Anoter important principle of cinematograp is obeed wen two segments of videos are joined togeter: cuts sould be unobtrusive and sustain te audience s attention on te narrative. One wa of compling wit tis principle is to avoid jump 5

cuts [15]. A jump cut is often regarded as a mistake in classical editing [17]. It usuall occurs wen two ver similar sots of te same subject (person or object) are joined togeter b a cut, producing te impression tat te subject jumps into a new pose. A jump cut produces a disorientation effect, confusing te spectators spatiall and temporall. Tere sould be a definite cange in image size and viewing angle from sot to sot. We propose te use of a similarit scale to classif te transition between two consecutive sots and make te appropriate editing decision. Firstl, given two video segments C and C, we calculate te istogram C of te last frame of C and te istogram C of te first frame of C. Ten we use a metric suc as a correlation coefficient to determine te degree of similarit between sots: F corr ( C, C ) i ( i ( C ( i) C C ( i) C ) 2 )( C ( i) C ) i ( C ( i) C ) 2 were C (i) and C (i) are te istogram values in te discrete interval i and C and C are te istogram bin averages. Hig values of correlation represent a good matc between te frames, tat is: F coor(c, C ) = 1 represent a perfect matc and F coor(c, C ) = -1 a maimal mismatc. Using te istogram correlation coefficient, we define tree classes of similarit {S 1, S 2, S 3}. Tese classes can be epressed b te following rules: If F coor (C, C ) (, 1] ten similarit class is S 1. If F coor (C, C ) [, ] ten similarit class is S 2. If F coor (C, C ) [-1, ) ten similarit class is S 3. Te similarit scale and te classes of similarit are illustrated in Figure 4. Te similarit class S 1 represents a class of ig similarit between frames and no editing procedure is required. Transitions of videos wit similarit S 1 are not noticed b te audience. Te similarit class S 2 represents a condition tat causes jump cuts. In tis case, no sot transition can be applied and te Editor agent sould use te master scene. Te similarit class S 3 represents a class of low similarit between frames. In tis case, te transition between videos can be done using a cut, dissolve, wipe or fade, witout causing jump cuts. 6

Figure 4. Similarit Scale (te values of and are eperimental). Te transitions of video segments classified in te similarit class S 3 and te recovered segments in te class S 2 lead us to te second step of te editing process: te selection of appropriated transitions. To identif te most appropriated transition to join two segments of videos, we formulate a set of rules based on te cinematograp literature to classif te video segments into te four basic classes of transitions. Considering te functions L T() and L S() tat return te temporal and spatial locations of a video segment based on its plot action cain, we define te following transition rules: C C ( L ( C ) L ( C )) ( L ( C ) L ( C )) T T T T T S S C C (( L ( C ) L ( C )) ( L ( C ) L ( C ))) ( F ( C, C ) 0.75) T T T S C C (( L ( C ) L ( C )) ( L ( C ) L ( C )) ( F ( C, C ) 0.75) T C C ( C C ) T fade S S S cut coor coor dissolve wipe Video segments classified as class T cut are read to be processed and te direct cut can be eecuted wen te end of te first video segment is reaced. Te video segments classified as class T disolve, T wipe or T fade must pass troug anoter analzer to determine te duration of te transition. In a conventional editing process, te editor usuall uses te duration of a transition to represent te temporal variation tat occurs during te transition. Based on tis idea and considering t te eibition time (usuall in minutes or seconds) and T te stor time (usuall ours or ears), te duration of te transition is given b: T ( C d, C ) t min ( LT ( C, C T ) T ma ma T )( t min ma t min ) were T ma and T min represent respectivel te maimal and minimal temporal variation in te stor time, and te variables t ma and t min represent te maimal and minimal duration of te transition. Usuall t min = 0.5 and t ma = 2.0 seconds (te minimum and maimal time for a transition, using dissolve transition as a reference). Figure 5 illustrates te process of computing te transition between two video segments using te metod above. 7

Figure 5. Eample of a transition computation in our eperimental comed sort film using Logtell [13]. Te proposed metod for automatic video editing is robust enoug to andle looping states (suc as te state 2 in Figure 1). Tese cases often fall witin te S 2 similarit range and te Editor agent etracts an alternative video of te same action from te master scene video. In tis wa, te metod can generate, for instance, a continuous figting scene tat gracefull switces between different viewpoints. Also we introduce a mecanism to guarantee variations in looping videos. Wen te same video is sowed to te viewers for more tan tree times, te Editor agent computes a stocastic cut point in te coverage video to transit to te same point in te master scene video. Tese nondeterministic transition points generate more natural scenes and, consequentl, loops in video are ardl noticed b te viewers. 5 Evaluation and Results Te capacit of te proposed metod to select appropriated transitions for video segments was evaluated troug te following strateg: we compared te results of our automatic video editing metod wit te decisions made b uman editors of wellknown movies. Firstl we analzed te initial scene of te movie Te Lord of te Rings: Te Return of te King [19]. Te scene starts wit Déagol falling in te river Anduin and finding te ring, and ends wit Frodo and Sam following Gollum troug te Vale of Morgul. Te test sequence ad approimatel 8 minutes and a total of 94 sots manuall separated in individual video files. We cose tis sequence because te scene starts in te past and graduall progresses to te present stor time, enabling te evaluation of dif- 8

ferent temporal transitions. Given te ordered sequence of sots of te movie, we computed (for eac consecutive sot) te transition between te sots (see Figure 6) and ten compared te result wit te original transition used in te movie. In tis case, tere is no need of a stor planner, because te movie is linear. In our interactive stortelling sstem, te temporal and spatial information used b te algoritm is automaticall calculated, but in tis test we manuall annotated tis information in eac sot. Te result of te test is sown in Table 1. Figure 6. Eample of a transition computation between two sots (C 79, C 80) of Te Lord of te Rings: Te Return of te King. Coprigted images reproduced under fair use polic. Table 1. Comparison of te original transitions in Te Lord of Te Rings: Te Return of Te King wit transitions from our metod. Transitions Cut Dissolve Wipe Fade Original 86 6 0 2 Our Metod 84 7 0 2 Hits Rate 97.6% 85.7% 100% 100% We found two transitions tat do not matc wit te original transitions. Te first one is a cut classified b te Editor agent as a dissolve. Analzing te video segments it is difficult to justif w te actual editor coose a cut, because it is clear tat te sots occur in different times and te cut causes some disorientation in te audience. Tis disorientation does not occur using a dissolve transition. In te oter mismatc, te Editor agent classified te transition as a jump cut. Indeed, visuall analzing te film we see tat tere is a jump cut in a ver sort figting scene. Actuall we cannot affirm tat te uman editor is wrong in tese cases, because eac editor as is/er own stle. As a second evaluation test, we selected a movie b anoter director in a different film genre: te classic Psco, directed b Alfred Hitccock [20]. We analzed te last scene of te film. Te scene starts wit Lila going to investigate Mrs. Bates ouse and 9

finises in te end of te film, wen Mar's car is pulled out of te swamp. Te test sequence ad approimatel 14 minutes and a total of 153 sots manuall separated in individual video files. Te results of tis test are sown in Table 2. Transitions Cut Dissolve Wipe Fade Original 150 2 0 1 Our Metod 150 2 0 1 Hits Rate 100% 100% 100% 100% Te automated video editing metod presented in tis paper does not involve comple computing tasks; owever, its computing compleit grows according to te resolution of te analzed video segments. To ceck te performance of te proposed metod we test its eecution in te most common video resolutions. For eac resolution, we compute te necessar time to get te image frames, compute te istograms, calculate te istogram correlation and classif te transition. Tis process was eecuted in a sequence of 40 video segments and te average time was computed for eac resolution. We ran tis test in an Intel Core i7 2.66 GHZ CPU, 8 GB of RAM using a single core. Te results of te performance tests are sown in Table 3. Resolution 704480 1280720 19201080 Time(ms) 21.268 29.455 59.170 6 Conclusions In tis paper we propose a real-time metod for automatic video editing to be used in video-based interactive stortelling sstems. Our approac is based in translating cinematograp editing teor directl into logical rules to automaticall select appropriated videos transitions, avoid jump cuts, and create smoot and continuous video sequences. Te evaluation and performance tests revealed te ig qualit of te proposed metod. As a furter work we are planning some improvements in our video-based interactive stortelling sstem, suc as te use of video compositing tecniques to manipulate te video segments and compose te scene elements in real-time. Anoter future work is te use of macine learning tecniques for te implementation of different editing stles. Acknowledgments Te autors would like to tank CAPES and CNPq for giving financial support to teir researc projects. 10

References [1] Cavazza, M.; Carles, F.; Mead, S. Caracter-based interactive stortelling. IEEE Intelligent sstems, vol. 17, no. 4, pp. 17-24, 2002. [2] Mateas, M.; Stern, A. Structuring content in te facade interactive drama arcitecture. In Proceedings of te Artificial intelligence and interactive digital entertainment conference, pp. 93-98, 2005. [3] Ciarlini, A., Pozzer, C., Furtado, A., Feijó, B., 2005. A logic-based tool for interactive generation and dramatization of stories. In Proceedings of te international conference on advances in computer entertainment tecnolog, Valencia, pp. 133-140. [4] Aquarium (Akvaario). Media Lab, Helsinki Universit of Art and Design Finland. Broadcast b Te Finnis Broadcasting Compan, 2000. [5] Porteous, J.; Benini, S.; Canini, L.; Carles, F.; Cavazza, M.; Leonardi, R. Interactive stortelling via video content recombination. In Proceedings of te international conference on Multimedia 2010, Firenze, Ital, pp. 1715-1718, 2010. [6] Ursu, M.F.; Kegel, I.C.; Williams, D.; Tomas, M.; Maer, H.; Zsombori, V.; Uomola, M.L.; Larsson, H.; Wver, J. SapeSifting TV: Interactive screen media narratives. Multimedia Sstems, vol. 14, no. 2, pp. 115-132, 2008. [7] Bocconi, S.; Nack, F.; Hardman, L. Using Retorical Annotations for Generating Video Documentaries. In Proceedings of te IEEE International Conference on Multimedia and Epo, Jul 2005. [8] Cua, T.S.; Ruan, L.Q. A Video Retrieval and Sequencing Sstem. ACM Transactions on Information Sstems, vol. 13, no. 4, pp. 373-407, 1995. [9] Aanger, G.; Little, T.D.C. Automatic composition tecniques for video production. IEEE Transactions on Knowledge and Data Engineering, vol. 10, no. 6, pp. 967-987, 1998. [10] Nack, F.; Parkes, A. Te Application of Video Semantics and Teme Representation in Automated Video Editing. Multimedia Tools and Applications, vol. 4, no. I, pp. 57-83, 1997. [11] Sen, Y.T.; Lieberman, H.; Davenport, G. Wat s Net? Emergent Stortelling from Video Collections. In Proceeding of International Conference on Human factors in computing sstems, pp. 809-818, 2009. [12] Ma, Y.F.; Lu, L.; Zang, H.J.; Li, M.J. A User Attention Model for Video Summarization. In Proceeding of ACM Multimedia 2002, pp. 533-542, 2002. [13] Logtell Project Web Site: ttp://www.icad.puc-rio.br/~logtell/ [14] Grasbon, D.; Braun, N. A morpological approac to interactive stortelling. In Proceedings of Cast01, Living in Mied Realities, Sankt Augustin, German, pp. 337-340, 2001. [15] Mascelli, J. Te Five C's of Cinematograp: Motion Picture Filming Tecniques, Los Angeles, US: Silman-James Press, 1965. [16] Brown, B. Cinematograp: Teor and Practice: Image Making for Cinematograpers, Directors, and Videograpers. Burlington, US: Focal Press, 2002. [17] Butler, J.G. Television: critical metods and applications. New Jerse, US: Lawrence Erlbaum Associates Publisers, 2002. [18] Tompson, R. Grammar of te Edit. Oford, Focal Press, 1993. [19] New Line Cinema, Te Lord of te Rings: Te Return of te King, 2003. [20] Universal Studios, Psco, 1960. 11