Audio/Video Synchronization Standards and Solutions A Status Report. Patrick Waddell/Graham Jones/Adam Goldberg



Similar documents
Audio and Video Synchronization:

Migrating Live Production to IP Technology. It s About Time

For Articulation Purpose Only

How To Test Video Quality With Real Time Monitor

Signal Quality in HDTV Production and Broadcast Services

High Dynamic Range Video The Future of TV Viewing Experience

FREE TV AUSTRALIA OPERATIONAL PRACTICE OP42

MPEG-H Audio System for Broadcasting

Implementing Closed Captioning for DTV

Dolby Vision for the Home

Datasheet EdgeVision

TV 2 AS HD DELIVERY FOR TV 2 AS

ATSC Digital Television Standard: Part 4 MPEG-2 Video System Characteristics

Case Study: Real-Time Video Quality Monitoring Explored

FREE TV AUSTRALIA OPERATIONAL PRACTICE OP- 59 Measurement and Management of Loudness in Soundtracks for Television Broadcasting

Copyright 2006 TeleMídia

CITY AND COUNTY OF SAN FRANCISCO SAN FRANCISCO GOVERNMENT TELEVISION CITY HALL DIGITAL UPGRADE PROJECT PHASE II SCOPE OF WORK

Optimizing BrightSign Video Quality

TECHNICAL OPERATING SPECIFICATIONS

Radio TV Forum 2014 Rohde & Schwarz Solutions

A Broadcasters Guide to PSIP

Implementing Closed Captioning for DTV

SDI TO HD-SDI HDTV UP-CONVERTER BC-D2300U. 1080i/720p HDTV Up-Converter with built-in Audio/Video processor

Technical Standards and Requirements for Radio Apparatus Capable of Receiving Television Broadcasting

1-MINIMUM REQUIREMENT SPECIFICATIONS FOR DVB-T SET-TOP-BOXES RECEIVERS (STB) FOR SDTV

5.1 audio. How to get on-air with. Broadcasting in stereo. the Dolby "5.1 Cookbook" for broadcasters. Tony Spath Dolby Laboratories, Inc.

DELIVERING CAPTIONS IN DTV An NCAM DTV Access Brief

ADVANTAGES OF AV OVER IP. EMCORE Corporation

High Definition (HD) Technology and its Impact. on Videoconferencing F770-64

Waveform Monitor Audio & Loudness Logging

Technical Paper. Dolby Digital Plus Audio Coding

Measurements on MPEG2 and DVB-T signals (1)

4K End-to-End. Hugo GAGGIONI. Yasuhiko MIKAMI. Senior Manager Product Planning B2B Solution Business Group

Understanding Video Latency What is video latency and why do we care about it?

PackeTV Mobile. solutions- inc.

. ImagePRO. ImagePRO-SDI. ImagePRO-HD. ImagePRO TM. Multi-format image processor line

An Introduction to High Dynamic Range (HDR) and Its Support within the H.265/HEVC Standard Extensions

Applications that Benefit from IPv6

Parameter values for the HDTV standards for production and international programme exchange

Proactive Video Assurance through QoE and QoS Correlation

A Brief on Visual Acuity and the Impact on Bandwidth Requirements

ULTRA HIGH DEFINITION TV OVER IP NETWORKS

HIGH-DEFINITION: THE EVOLUTION OF VIDEO CONFERENCING

SDI SDI SDI. Frame Synchronizer. Figure 1. Typical Example of a Professional Broadcast Video

Cable TV Headend Solutions

The Broadcaster s Guide to SMPTE 2022: Applications in Video Contribution and Distribution

White paper. An explanation of video compression techniques.

FRAUNHOFER INSTITUTE FOR INTEg RATEd CIRCUITS IIS. drm TesT equipment

IsumaTV. Media Player Setup Manual COOP Cable System. Media Player

Common 16:9 or 4:3 aspect ratio digital television reference test pattern

Best practices for producing quality digital video files

Selenio Media Convergence Platform

Ultra-High Definition TV Acquisition and Exchange

ATSC Recommended Practice: Techniques for Establishing and Maintaining Audio Loudness for Digital Television

Conference interpreting with information and communication technologies experiences from the European Commission DG Interpretation

(Refer Slide Time: 4:45)

White Paper on Video Wall Display Technology in Videoconferencing HUAWEI TECHNOLOGIES CO., LTD. Issue 01. Date

Tresent Technologies IPQ User report

Video Conferencing Glossary of Terms

RECOMMENDATION ITU-R BR *, ** Analogue composite television tape recording

R 95 SAFE AREAS FOR 16:9 TELEVISION PRODUCTION REVISION 1 SOURCE: SP-BHD, SP-QC

Session 4. Market & Business Development in mobile TV, satellite TV, Cable TV, IP TV, etc.

Continuity Monitoring of Audio, Video and Data in a Multi-Channel Facility David Strachan Evertz Microsystems, Ltd.

Testing Video Transport Streams Using Templates

INFORMATION TECHNOLOGY - GENERIC CODING OF MOVING PICTURES AND ASSOCIATED AUDIO: SYSTEMS Recommendation H.222.0

Understanding Network Video Security Systems

The Evolution of Video Conferencing: UltimateHD

Convention Paper 7896

Subtitles are a valuable piece of the TV broadcast and streaming

SERIES H: AUDIOVISUAL AND MULTIMEDIA SYSTEMS Infrastructure of audiovisual services Transmission multiplexing and synchronization

QOS Requirements and Service Level Agreements. LECTURE 4 Lecturer: Associate Professor A.S. Eremenko

The Picture must be Clear. IPTV Quality of Experience

Workflow Comparison: Time Tailor versus Non-Linear Editing Systems WHITE PAPER

Simple Voice over IP (VoIP) Implementation

reach a younger audience and to attract the next-generation PEG broadcasters.

ITU-T G Quality of experience requirements for IPTV services

BROADCASTING ACT (CHAPTER 28) Code of Practice for Television Broadcast Standards

High Definition (HD) Image Formats for Television Production

Avid DNxHD Technology

BROADCAST SMPTE ST Introduction

PERFORMANCE ANALYSIS OF VIDEO FORMATS ENCODING IN CLOUD ENVIRONMENT

Multichannel stereophonic sound system with and without accompanying picture

Emerging Markets for H.264 Video Encoding

FREE TV AUSTRALIA OPERATIONAL PRACTICE OP 60 Multi-Channel Sound Track Down-Mix and Up-Mix Draft Issue 1 April 2012 Page 1 of 6

Chapter 3 ATM and Multimedia Traffic

RECOMMENDATION ITU-R BR Analogue composite television tape recording

ULTRA HIGH DEFINITION MARKET OUTLOOK AND NEXT STEPS

High Definition Broadcast Server BROADCAST

White paper. HDTV (High Definition Television) and video surveillance

Live Web Broadcasting of Library Board Meetings - Update

IPTV and its transportation...

Next Generation DTV: ATSC 3.0

Multichannel audio: From studio to listener

3Gb/s SDI for Transport of 1080p50/60, 3D, UHDTV1 / 4k and Beyond Part 1 - Standards 2013 SMPTE

Transcription:

/Video Synchronization Standards and Solutions A Status Report Patrick Waddell/Graham Jones/Adam Goldberg

ITU-R BT.1359-1 (1998) Only International Standard on A/V Sync Subjective study with EXPERT viewers SDTV not HDTV images CRT displays, of course At first glance it seems loose: +90 ms to -185 ms as a Window of Acceptability In their terms, positive values are audio advanced relative to video, negative is delayed relative to video We will examine these results more closely The numbers were statistically significant for each point Remember, the measurements were very carefully made Expert viewers 20 CRT monitors fixed viewing distances 2

ITU-R BT.1359 Figure 2 0 C Undetectablity plateau C' Subjective evaluation results (Diffgrade) -0.5-1.0-1.5 A B Sound delayed wrt vision Detectability threshold Acceptability threshold B' Sound advanced wrt vision A' -200 ms -150 ms -100 ms -50 ms 0 ms +50 ms +100 ms ITU-R BT.1359 Figure 2 3

ITU-R BT.1359 Figure 2 Let s quickly look at Figure 2 versus Fixed Pixel Display rates 30/1.001 Hz (or 33.3 ms per image) 25 Hz (or 40 ms per image) This may be informative

Figure 2 with Fixed Pixel Display Timings Shown Subjective evaluation results (Diffgrade) 5

Figure 2 with Fixed Pixel Display Timings Shown 25 Hz Frame Times (40 ms) shown 0 C Undetectablity plateau C' -0.5 B Detectability threshold B' -1.0 A A' -1.5 Sound delayed wrt vision Acceptability threshold Sound advanced wrt vision -200 ms -150 ms -100 ms -50 ms 0 ms +50 ms +100 ms ITU-R BT.1359 Figure 2 6

Fixed Pixel Display Timings Interesting results Note that both charts assumed interlaced video So 1080P/60 or 1080P/50 display times are half that shown The measured values with CRTs line up fairly well with FPM times for detectability Most of the ITU study measurements were with 25 Hz video (except the Japanese, who used 30 Hz) Note that the Acceptance threshold is merely 2 frames advanced for either frame rate! Our brains are used to sound being delayed in nature (by distance) Our brains are confused when sound precedes the vision! 7

Lip Sync is an End-to-End Issue 1 1' 1'' Outside Broadcast 2 Codec Simplified Reference Chain for television sound/vision timing from ITU-R BT.1359 1998 Contribution 1 1' 1'' 2 2' 3 3' 4 4' 5 Studio Codec Contribution Compilation Station (1) Codec Distribution Local Station (1) Emission Codec STL Local transmitter Undetectable from -100 ms to 5' +25 ms Detectable at -125 ms & +45 ms Becomes unacceptable at -185 ms & +90 ms 6 6' Sound delayed + Sound advanced (1) 8

Subjective Tests Subjective tests for the ITU-R BT.1359 standard were carried out in Australia, Japan and Switzerland in 1995 and 1996 Used PAL and NTSC video Tube cameras, 22 CRT displays 6x picture height New tests carried out this year by JEITA in Japan HD, CCD cameras, large flat panel displays, 3x picture height Results to be published later this year Will possibly show lower threshold levels ITU standard may need to be revised?? 9

ITU-R BT.1359 Thresholds Undetectable from -100 ms to +25 ms Detectable at -125 ms at & +45 ms Becomes unacceptable at -185 ms & +90 ms 10

Recommended Tolerances At the input to the transmitter/emission encoder ITU BT.1359 1998 ATSC IS/191 2003 EBU R37 2007-30 ms +22.5 ms -45 ms +15 ms -60 ms +40 ms Sound delayed + Sound advanced Undetectable from -100 ms to +25 ms Detectable at -125 ms at & +45 ms Becomes unacceptable at ATSC and EBU tolerances are for absolute -185 ms A/V & +90 timing ms errors ITU tolerance is for the A/V timing difference in the path from the output of the final program source selection element to the input to the transmitter for emission 11

Link Budget 1 1' 1'' 2 Outside Broadcast -100 ms to +25 ms Codec Contribution 1 1' 1'' 2 2' 3 3' 4 4' 5 Studio Reference point Undetectable from 5' -100 ms to +25 ms ` Codec Contribution Detectable at -125 ms at & +45 ms Becomes unacceptable at -185 ms & +90 ms Compilation Station (1) 6 6' Codec Distribution Local Station (1) Emission STL Codec -30 ms +22.5 ms -100 +25 Local transmitter -130 ms +47.5 ms (1) 12

Broadcaster Tolerance Given the level of uncertainty of A/V sync coming out of production and the: Variability of consumer devices Variability in viewing conditions In order to have reasonable expectation that viewers will see acceptable lip sync: The broadcaster has no choice but to target a very low or zero error through the chain from reference point to emission encoder There is little or no spare budget to allocate! 13

Correct Sync Errors Where they Occur 14 Good system design can correct for known and predictable differential delays Solid state cameras Frame synchronizers Vision switchers, format converters, etc. Flat panel monitors with associated audio monitoring Fixed and variable delay compensation Available from various manufacturers Control signals from some video devices allow automatic delay switching Care needed to avoid audio artifacts Some errors in the chain cannot be predicted or corrected automatically where they occur

Out of Service Measurement Clapper board Electronic clapper boards Beep-flash systems Sarnoff Visualizer 15

In Service Measurement Pixel Instruments LipTracker Asaca TuLips Both use sophisticated analysis of lip movements and associated audio sounds to establish an absolute measurement of sync error at any point in the chain Applicable when moving lips are clearly visible May not be very practical for real world broadcast systems 16

What Is Needed? A dynamic in-service method that can respond in near real time Works while content is playing - not a calibration method Not reliant on any specific signal format or interface so it can be carried through all the different parts of the entire signal chain Particularly needed for the professional parts of the delivery chain Possible application for consumer devices 17

A/V / Fingerprint / DNA Extract features from both audio and video and combine together in an independent data stream Use fingerprinting methods that are resilient to processing of the audio and video signals Designed to allow typical types of processing (data rate compression, format changes, etc.) This data stream may be called an A/V Sync, Fingerprint, or DNA Relies on generating the signature at a point where A/V sync is known to be correct From that point on the system is designed to measure and maintain the relative audio/video timing that was present when the signature was generated 18

A/V Synchronization Video Frames (e.g. 33.3 msec) Video Video Video Video Video Video Video Blocks (e.g. 10 msec) 19 Slide courtesy of Dolby

A/V Sync Comparison Sent in A/V Sync i Sig_R a delay delay Compare s delay A a A v Video delay i i Extract Extract Video Video delay i i Sig_A a Sig_A v Compare Delays A/V Sync Delay and Video Unknown Sync Compare s Video delay Sent in A/V Sync i Sig_R v Difference between audio delay and video delay is the A/V sync error 20 Slide courtesy of Dolby

A/V Sync Correction Dolby A/V Real-Time System 21 Slide courtesy of Dolby

A/V Sync Correction File Server Content Distribution Network File Server A/V file A/V file Adjust A/V file sync as necessary A/V sync signature Variable File Processing A/V sync signature Extract & Video s and Video are known to be in sync Unknown A/V sync Extract & Video s Generate A/V Sync Comparisons Meter Display A/V Sync Generator Software 22 Dolby A/V File-based System Slide courtesy of Dolby A/V Sync Detection/ Correction Software

Broadcast Chain 1 1' 1'' 2 Outside Broadcast Codec Contribution With a fingerprint system, all errors occurring after the reference point can be measured and corrected prior to encoding for emission 1 1' 1'' 2 2' 3 3' 4 4' 5 Studio Reference point Codec Contribution Compilation Station (1) Codec Distribution Local Station (1) Emission STL Codec Local transmitter If adopted by consumer 5' devices, the same fingerprint from the reference point could possibly be used to correct errors at the point of display 6 6' (1) 23

Products/ Technologies Evertz IntelliTrak Miranda Densite HLP-1801 Sigma Electronics Arbalest K-Will QuMax 2000 Dolby A-V All use A-V signature / DNA / fingerprint metadata All assume correct sync at the input reference point All measure errors at downstream point, enabling errors to be corrected automatically 24

A Standardized Fingerprint? Entire program chain usually not under control of broadcaster From user s perspective, it is highly desirable for equipment from different manufacturers in different parts of the chain to interoperate Is standardized fingerprint metadata for A-V sync the solution? Standardized transport methods? Seeking input from broadcasters and users on what they want from manufacturers 25

SMPTE 22TV Standards Work A-V Sync Measurement and Assessment Project scope: Define recommended techniques for audio-video synchronization error measurement, and techniques and environment for synchronization assessment Specific tasks: Determine requirements for consistent out-of-service measurements and inservice assessments and measurements of audiovisual synchronization errors, as may be necessary and practical. 26

DTV Receivers 1 1' 1'' Outside Broadcast 2 Codec Simplified Reference Chain for television sound/vision timing from ITU-R BT.1359 1998 Contribution 1 1' 1'' 2 2' 3 3' 4 4' 5 Studio Codec Contribution Compilation Station (1) Codec Distribution Local Station (1) Emission Codec STL Local transmitter 5' 6 6' (1) 27

28 CEA-CEB20

CEA-CEB20 A/V Synchronization Processing outlines the steps that an MPEG decoder should take to ensure and maintain audio/video synchronization. Such synchronization is necessary for end-viewer satisfaction. Written assuming the reader has a fundamental understanding of MPEG-2 Systems, but not of real world conditions 29

Real-world Conditions Why is this important? Designers often are not aware of the types of input disruptions that are common and the consequences of those to decoding Designers forget seemingly obvious things, such as PCR wrap-around Designers may not understand the importance of frequent cross-checking of clock samples between separate audio and video decoder ICs 30

Real-world Conditions The industry continues to see new entrants into the decoder market Both for professional as well as home use Even experienced engineers (with traditional video/audio backgrounds) make horrible assumptions about MPEG While CEB20 will assist, it cannot be regarded as a panacea 31

CEB20 Major Topics Receiver Architecture Model Decoder Clock Startup and Maintenance Presentation Time Processing Advanced Transport Stream Processing for Recording or Remote Playback Carriage of MPEG-2 TS over IP networks 32

33 Receiver Hardware Reference Model

Receiver Architecture Model Demultiplexer PCR Assist How the demux hardware can assist keeping clocks accurate Decoder Clock Hardware for buffer management Identifies issues with variance in buffer sizes between SDOs (DVB vs. ATSC/SCTE) Discusses maintenance of A/V sync at a high level and Video Output Clocks 34

Decoder Clock Startup and Maintenance Startup Disturbances to the MPEG Transport Stream Major Adjustments System Time-Base Discontinuity Recommended Decoder Clock Error Event Recovery Method Minor Adjustments 35

Presentation Time Processing Startup Practical Considerations This is a key area and needs attention paid to it Adjustments Major Adjustments 36

Advanced Transport Stream Processing for Recording or Remote Playback Partial Transport Stream Recording Recovery of SPTS from MPTS Clock maintenance in such a situation Maintaining Inter-packet Timing Relationships During Playback of Recorded Content Critical for recovered SPTS Pointers to two documented methods of doing this 37

THANK YOU 38