MEDIA SYNCHRONIZATION STANDARDISATION AT W3C - THE MISSING BITS
Jack Jansen, CWI
OVERVIEW
- Why bother with synchronization?
- How should we tackle it?
- Prior work (includes W3C standardization)
- Work in progress at CWI
WHY BOTHER?
- For the benefit of the end user!
- How much desync is a problem?
- How do we resync?
- Who should set synchronization policy?
  - The author (of the presentation, game, ...)
  - The end user?
  - ... not the synchronization implementor or protocol designer
DESYNC BOUNDS
- Sometimes you want hard synchronization: lip-sync, karaoke
- Sometimes you want no synchronization: background music
- Sometimes you want something in between: slideshow commentary
- Corollary: the acceptable bound depends on context and media content
RESYNC POLICY
- Simple-minded: pause when ahead, skip when behind
- Media-dependent: skip/insert at silence; pitch-maintaining speedup/slowdown
- Context-dependent: music video vs. diagram commentary
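The "simple-minded" policy above can be sketched in a few lines. This is an illustrative sketch, not part of any spec: the `resync` function, the positions-in-seconds interface, and the 100 ms threshold are all assumptions made for the example.

```python
# Minimal sketch of the "simple-minded" resync policy:
# pause when the slave stream runs ahead of the master,
# skip forward when it falls behind.

THRESHOLD = 0.1  # seconds of tolerated desync (illustrative value)

def resync(master_pos: float, slave_pos: float) -> tuple[str, float]:
    """Return (action, new_slave_pos) for one resync decision."""
    drift = slave_pos - master_pos
    if drift > THRESHOLD:        # slave is ahead: pause until master catches up
        return ("pause", slave_pos)
    if drift < -THRESHOLD:       # slave is behind: skip to the master position
        return ("skip", master_pos)
    return ("play", slave_pos)   # within tolerance: keep playing
```

For example, `resync(10.0, 10.5)` yields `("pause", 10.5)` and `resync(10.0, 9.0)` yields `("skip", 10.0)`. Note how crude this is: it ignores both media content and context, which is exactly the slide's point.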
DISTRIBUTED SYNC
- All the usual distributed-systems problems
- Hard sync is physically impossible in the general case: speed of light
- Multiple users in the loop leads to causality issues: cheering before the goal
AWARENESS PROBLEM
- The general public thinks more technology will solve the issue
- Skype visionary video showing people in Europe and California playing together live: http://youtu.be/vqzyudtn0cq
- Great-circle distance is 9000 km: 33 ms. One-way...
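The distance argument is easy to check back-of-the-envelope: even at vacuum light speed, 9000 km imposes an irreducible one-way delay on the order of 30 ms, consistent with the slide's figure (real paths are longer and signals in fibre are slower, so actual delays are worse).

```python
# Back-of-the-envelope check: irreducible one-way delay over a
# great-circle path, assuming signals at vacuum light speed.
# Real networks do worse: fibre is slower and paths are not great circles.

C = 299_792_458  # speed of light in vacuum, m/s

def one_way_delay_ms(distance_km: float) -> float:
    """Minimum one-way propagation delay in milliseconds."""
    return distance_km * 1000 / C * 1000

print(round(one_way_delay_ms(9000), 1))  # ≈ 30.0
```

No protocol improvement can get under this floor, which is why "more tech" cannot make truly hard distributed sync possible.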
TASK BREAKDOWN
1) What is good (or good enough) synchronization?
2) How do we model good synchronization?
3) How do we detect bad synchronization?
4) How do we fix bad synchronization?
5) Evangelism
CURRENT PRACTICE
IMPLEMENTATIONS
- Ignore synchronization: media playback in video chat
- Half-assed best effort: lip-sync in video chat
- Rigorous hard synchronization: broadcast
- Corollary: current practice pretty much ignores context
SPECIFICATIONS
- Most formats are silent about semantics: they define what perfect sync would be
- RTP/RTCP defines how to measure desync, and allows subgrouping of streams (SSRC)
- SMIL 2.0 allows subgrouping and specifying sync granularity... but has no way to specify the semantics of how to fix things
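The RTP/RTCP measurement idea can be sketched as follows: an RTCP sender report pairs an RTP timestamp with an NTP wallclock time, so a receiver can map each stream's media time onto a common clock and compute inter-stream skew. The class and function below are an illustrative sketch under that model, not a real RTP stack (a real one must handle timestamp wraparound, clock drift, and more).

```python
# Sketch of inter-stream desync measurement in the RTP/RTCP style.
# One sender report (SR) per stream lets us map any RTP timestamp
# in that stream to sender wallclock time; comparing two streams'
# wallclock times for samples meant to play together gives the skew.

class StreamClock:
    def __init__(self, sr_rtp_ts: int, sr_ntp_s: float, clock_rate: int):
        self.sr_rtp_ts = sr_rtp_ts    # RTP timestamp in the sender report
        self.sr_ntp_s = sr_ntp_s      # NTP time of that report, in seconds
        self.clock_rate = clock_rate  # RTP clock rate, e.g. 90000 for video

    def to_wallclock(self, rtp_ts: int) -> float:
        """Map an RTP timestamp to sender wallclock time (seconds)."""
        return self.sr_ntp_s + (rtp_ts - self.sr_rtp_ts) / self.clock_rate

def skew_s(a: "StreamClock", a_ts: int, b: "StreamClock", b_ts: int) -> float:
    """Wallclock offset between two samples that should play together."""
    return a.to_wallclock(a_ts) - b.to_wallclock(b_ts)
```

For example, with audio at 48 kHz and video at 90 kHz both anchored at wallclock 100.0 s, `skew_s(audio, 48000, video, 90000)` is 0: one second of audio against one second of video. Note that this only *measures* desync; as the slide says, the specs stay silent on how to fix it.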
SMIL IN A NUTSHELL
- SMIL references media, does not contain it
- Model: hierarchical composition
- Provides sequential and parallel composition, and selection

<par>
  <audio src="background.mp3"/>
  <seq>
    <par>
      <video src="slide1.m4v"/>
      <text src="caption1.txt"/>
      <audio systemLanguage="nl" src="nl-voiceover1.mp3"/>
    </par>
    ...
  </seq>
</par>
SYNC CONTROL
- Works with the hierarchical containment/grouping model
- Most interesting within a <par>:
  - syncMaster="true" sets the timing master in the group
  - syncBehavior="locked" locks to the timing master
  - syncTolerance="1s" loosens the locking
  - syncBehavior="independent" decouples from the master
  - syncBehavior="canSlip" locked, but very loosely
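Putting the attributes above together, here is a sketch of a <par> where the video drives the group's timing, captions are locked to it with some slack, and the background music is fully decoupled. The file names and the 0.2 s tolerance are illustrative choices, not taken from any real presentation.

```xml
<par>
  <!-- the video is the timing master for this group -->
  <video src="main.m4v" syncMaster="true"/>
  <!-- captions lock to the master, tolerating up to 0.2 s of drift -->
  <text src="captions.txt" syncBehavior="locked" syncTolerance="0.2s"/>
  <!-- background music plays freely, never resynced -->
  <audio src="background.mp3" syncBehavior="independent"/>
</par>
```

This expresses *which* streams must stay together and how tightly, but, as the next slide notes, not *how* an implementation should bring a drifting stream back.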
SYNC CONTROL - 2
- That was the Good...
- The Bad: resync behavior is explicitly implementation-defined
- The Ugly: the real SMIL spec is even more baroque than sketched here...
WORK IN PROGRESS
REMOTE SYNC
- Master's project by Shahab Ud Din
- Goal: experiment with distributed, document-based synchronized playback
  - No single source of media and timing
  - High bandwidth, low bandwidth, slideshow
- Needs to cater for interaction and navigation
- A platform to try out algorithms in a document-based setting
REMOTE SYNC - PLANS
- Include live audio/video chat streams
- Sync to live chat streams
- Change sync behaviour based on context: lecture vs. shared experience
- Causality-based sync would be fun...
- User testing
MULTITOUCH SYNC
- Master's project by Jan-Willem Kleinrouweler
- Goal: allow multitouch events to be shared
- Questions:
  - Sync to audio/video streams?
  - How to resync if events fall behind?
MULTITOUCH SYNC - 2
- Inter-finger sync? Synced to audio/video?
- Can we jump to the destination? Use a shortcut? Fast-forward?
- All application-dependent!
FUTURE WORK
- Use the distributed sync platform to investigate what good synchronization means in various settings
- Use both platforms to determine how to fix bad synchronization
- Come up with a model that decouples policy from implementation
- Make policy owners aware of the issues
- Integrate with HTML5, DASH, WebRTC, ...
SLIDES ONLINE