Universidad Politécnica de Madrid
Escuela Técnica Superior de Ingenieros de Telecomunicación

ANALYSIS, MONITORING, AND MANAGEMENT OF QUALITY OF EXPERIENCE IN VIDEO DELIVERY SERVICES OVER IP

Doctoral Thesis

Pablo Pérez García
Telecommunication Engineer

2013
Universidad Politécnica de Madrid
Departamento de Señales, Sistemas y Radiocomunicaciones
Escuela Técnica Superior de Ingenieros de Telecomunicación

Doctoral Thesis

ANALYSIS, MONITORING, AND MANAGEMENT OF QUALITY OF EXPERIENCE IN VIDEO DELIVERY SERVICES OVER IP

Author: Pablo Pérez García, Telecommunication Engineer
Director: Narciso García Santos, Doctor of Telecommunication Engineering

2013
Doctoral Thesis

ANALYSIS, MONITORING, AND MANAGEMENT OF QUALITY OF EXPERIENCE IN VIDEO DELIVERY SERVICES OVER IP

Author: Pablo Pérez García
Director: Narciso García Santos

Committee appointed by the Rector of the Universidad Politécnica de Madrid, on the ____ of ____ of ____

President:
Member:
Member:
Member:
Secretary:

Defense and reading of the Thesis held on the ____ of ____ of 2013 in ____

Grade:

THE PRESIDENT    THE MEMBERS    THE SECRETARY
If you make listening and observation your occupation you will gain much more than you can by talk.

Robert Baden-Powell
UNIVERSIDAD POLITÉCNICA DE MADRID

Abstract

DOCTORAL THESIS

ANALYSIS, MONITORING, AND MANAGEMENT OF QUALITY OF EXPERIENCE IN VIDEO DELIVERY SERVICES OVER IP

by Pablo Pérez García

This thesis proposes a comprehensive approach to the monitoring and management of Quality of Experience (QoE) in multimedia delivery services over IP. It addresses the problem of preventing, detecting, measuring, and reacting to QoE degradations, under the constraints of a service provider: the solution must scale to a wide IP network delivering individual media streams to thousands of users.

The solution proposed for the monitoring is called QuEM (Qualitative Experience Monitoring). It is based on the detection of degradations in the network Quality of Service (packet losses, bandwidth drops...) and the mapping of each degradation event to a qualitative description of its effect on the perceived Quality of Experience (audio mutes, video artifacts...). This mapping is based on the analysis of the transport and Network Abstraction Layer information of the coded stream, and allows a good characterization of the most relevant defects that exist in this kind of service: screen freezing, macroblocking, audio mutes, video quality drops, delay issues, and service outages. The results have been validated by subjective quality assessment tests. The methodology used for those tests has also been designed to mimic as closely as possible the conditions of a real user of those services: the impairments to evaluate are introduced randomly in the middle of a continuous video stream.
Based on the monitoring solution, several applications have been proposed as well: an unequal error protection system which provides higher protection to the parts of the stream which are more critical for the QoE, a solution which applies the same principles to minimize the impact of incomplete segment downloads in HTTP Adaptive Streaming, and a selective scrambling algorithm which ciphers only the most sensitive parts of the media stream. A fast channel change application is also presented, as well as a discussion about how to apply the previous results and concepts in a 3D video scenario.
UNIVERSIDAD POLITÉCNICA DE MADRID

Resumen (Summary)

DOCTORAL THESIS

ANALYSIS, MONITORING, AND MANAGEMENT OF QUALITY OF EXPERIENCE IN VIDEO DELIVERY SERVICES OVER IP

by Pablo Pérez García

This thesis studies the monitoring and management of Quality of Experience (QoE) in video delivery services over IP. It addresses the problem of how to prevent, detect, measure, and react to QoE degradations from the perspective of a service provider: the solution must scale to a wide IP network delivering individual streams to thousands of users simultaneously.

The proposed monitoring solution has been named QuEM (Qualitative Experience Monitoring). It is based on detecting degradations in the network quality of service (packet losses, abrupt bandwidth drops...) and inferring from each of them a qualitative description of its effect on the perceived Quality of Experience (audio mutes, video defects...). This analysis relies on the transport and Network Abstraction Layer information of the coded streams, and makes it possible to characterize the most relevant defects observed in this kind of service: freezes, macroblocking, audio mutes, loss of video quality, delays, and service interruptions. The results have been validated through subjective quality tests. The methodology used in those tests has in turn been developed to imitate as closely as possible the viewing conditions of a user of this kind of service: the defects under evaluation are introduced randomly in the middle of a continuous video sequence.

Some applications based on the monitoring solution have also been proposed: an unequal error protection system which offers more protection to the parts of the video most sensitive to losses, a solution to minimize the impact of interrupted segment downloads in HTTP Adaptive Streaming, and a selective scrambling system which encrypts only the most sensitive parts of the video. A fast channel change solution has also been presented, as well as an analysis of the applicability of the previous results to a 3D video scenario.
Acknowledgements

This thesis would not have been possible without the help of all the people with whom I have been so lucky to share my way in these more than eight years. Let me express my gratitude to all of them in my mother tongue.

Life is a set of relationships; and listing all those that can be forged in the eight years that this work has lasted would take more space than is probably reasonable to devote in a doctoral thesis. So it is likely that I am being unfair to some people who, through carelessness, forgetfulness, or lack of space, will not be mentioned here. My apologies (and thanks) go in advance to them as well.

Thanks above all to Narciso García, who keeps managing to find room in his increasingly complicated schedule to accompany me on this adventure. It is a privilege to have him as thesis director. Thanks also, very especially, to Jaime Ruiz, who has been much more than a manager during these eight years. I do not exaggerate when I say that, were it not for him, I could hardly have finished this work.

Thanks to the exceptional human and professional team with whom I have been lucky to work over these years at Telefónica I+D and Alcatel-Lucent. To Jesús Macías, who taught me to look at video in a different way. To Álvaro Villegas, on whose work a good part of mine rests. To Silvia Varela, for helping me find the right approach to this thorny matter of quality. To Enrique Estalayo and José M. Cubero, with whom I have shared so much in so many projects. To Ernesto Puerta, for the conversations about quantization and other arcane matters. To Javier López Poncela, for guiding me through the ins and outs of decoders.

Thanks also to the people of the Grupo de Tratamiento de Imágenes, who have kept welcoming me as if I were at home throughout all these years. Very particularly to Jesús Gutiérrez, for all the work on the subjective quality tests: without him, finishing this thesis would have been much harder. Thanks also to Julián Cabrera and Fernando Jaureguizar, always willing to lend a hand with whatever was needed.

My sincere gratitude to all those who, over these years, have also contributed their grain of sand to this thesis. To Juan Casal, for sharing his experience in video coding. To Rocío Bravo, for her help with television audiences. To all the partners of CENIT VISION, where a good part of the research I now present took shape.

Finally, many thanks to my family and friends. To my brothers Lucas and David, who marked the path to follow. To my brother Jesús, from whom I have learned the little I know about digital audio (and the odd television trick). To my mother Teresa, who has done so much to push me to finish the thesis. To my father Juan, who would surely have liked to see it finished, and with whom I also discussed some of the equations that appear in it. And to Graciela, for everything we have shared, and what is yet to come; so much that it cannot be summed up in one sentence.

Thanks, in short, to everyone who has made it possible for this thesis to be written. Even of those whom, for lack of space, I have not had the chance to mention in these lines, I keep a fond memory in my heart. Thanks to you, who are taking the trouble to read these pages. And thanks to God for having put us in touch.
Contents

Abstract
Resumen
Acknowledgements
List of Figures
List of Tables
Abbreviations

1 Introduction
  Motivation
  Overview

2 Understanding Quality of Experience
  Quality of Experience and its relatives
  A word about multimedia services
  Players
  Coding standards and transport protocols
  Artifacts
  Who is who in the QoE metrics
  Subjective quality assessment
  Full-Reference quality metrics
  Reduced-Reference quality metrics
  No-Reference quality metrics
  Other topics related to QoE in IPTV services
  Media formats in IPTV deployments
  Conclusions

3 Designing QoE-Aware Multimedia Delivery Services
  Introduction
  Delivering multimedia over IP
  Architecture of a multimedia service delivery platform
  Impairing the Quality of Experience
  3.3 QuEM: a qualitative approach to QoE monitoring
  Problem statement
  System design
  Qualitative Impairment Detectors
  Severity Transfer Function
  A Subjective Assessment methodology to calibrate Quality Impairment Detectors
  Design principles
  Test methodology
  Selection of impairments
  QoE enablers
  Headend metadata architecture
  Intelligent Packet Rewrapper
  Edge Servers for IPTV and OTT
  Conclusions

4 Quality Impairment Detectors
  Introduction
  Video Packet Loss Effect Prediction (PLEP) model
  Description of the model
  Experiment
  Subjective analysis
  Audio packet loss effect
  Objective analysis
  Subjective analysis
  Coding quality and rate forced drops
  Analysis of feature-based RR/NR metrics as estimators of video coding quality
  Managing coding quality drops
  Outages
  Detection of outages
  Subjective impact of outages
  Latency
  Lag
  Channel Change time
  Latency trade-offs
  Mapping to Severity
  Conclusions

5 Applications
  Introduction
  Unequal Error Protection
  Priority Model
  Experimentation and results
  Applications
  Fine-grain segmenting for HTTP adaptive streaming
  Description of the solution
  5.4 Selective Scrambling
  Problem statement and requirements
  Algorithms
  Results
  Fast Channel Change
  Application to 3D Video

Conclusions

A Experimental setup
  A.1 Introduction
  A.2 Subjective Assessment based on QuEM approach
  A.2.1 Selection and preparation of content
  A.2.2 Selection of impairments
  A.2.3 Test sessions
  A.3 Subjective quality assessment of H.264 video encoders
  A.4 Test sequences from IPTV deployments

Bibliography
List of Figures

2.1 Layer and domain model for multimedia services
Protocol stack for multimedia services over IP
Models for objective quality assessment: FR/RR/NR
Hierarchical GOP structure
Network architecture for IPTV and OTT services
Delivery chain of a multimedia service
QuEM architecture design
Test sequences in ACR
Test sequences in our proposed method
Questionnaire for subjective assessment tests
Structure of the content streams in the subjective assessment test session
Schematic representation of a modular headend
RTP header and extension introduced by the rewrapper processing
Video sequence used for qualitative analysis
MSE and PLEP for all sequences under study, varying the loss position
Detail of MSE and PLEP for all sequences under study
MSE vs PLEP (log scale) and linear fit
% of different macroblocks vs PLEP and linear fit
% of different macroblocks and PLEP for all sequences under study, varying the loss position
Results of the subjective assessment for Video Loss impairments
Detailed results for each of the individual segments for Video Loss
Waveform of a lossy audio file
Effect of audio losses: measured vs. expected
Short-length audio losses
Results of the subjective assessment for Audio Loss impairments
Detailed results for each of the individual segments for Audio Loss
Results of TI and Contrast NR metrics
Results of the subjective assessment for Rate Drop impairments
Detailed results for each of the individual segments for Rate Drop
Results of the subjective assessment for Outage impairments
Detailed results for each of the individual segments for Outage
Simplified transmission chain for real-time video
Decoding delay for video and audio components of a MPEG-2 Transport Stream
Results for all the QuIDs mentioned in the chapter
5.1 Example of the packet priority model
Implementation of the prioritization model
Effect of the window size in packet prioritization results
Values of MSE comparing random vs. priority-based packet loss
Effect of varying the loss burst size
Contribution of each term to the prioritization equation
Effects of a limited bit budget to encode the priority
Priority-based HTTP Adaptive Streaming segment structure
A.1 Structure of the content streams in the subjective assessment test session
A.2 Summary of the subjective quality assessment test results
A.3 Subjective MOS for a football sequence
List of Tables

2.1 ACR and DCR evaluation scales
Priority values used in the RTP header extension
Coefficient of determination (R²) of MSE vs PLEP fit for several video sequences
PLEP impairments analyzed in the subjective assessment tests
Audio losses analyzed in the subjective assessment tests
Comparison of NR/RR results with subjective tests
Quality drops analyzed in the subjective assessment tests
Outage events analyzed in the subjective assessment tests
Example Channel Change time ranges and their mapping to QoE
Priority value for each slice type
Values of the Aggregated Gain Ratio
Bit budget assignation to encode priority
Minimum scrambling rate required to completely lose the video signal
A.1 Video test sequences: bitrate and resolution
A.2 Bitrate drops
A.3 Frame rate drops
A.4 Audio losses
A.5 Macroblocking errors
A.6 Video freezing
A.7 Impairment sets
A.8 Example of a sequence of impairments
A.9 Test sequences
Abbreviations

3G  Third generation of mobile communication technology
ACR  Absolute Category Rating
AL-FEC  Application Layer Forward Error Correction
ARQ  Automatic Repeat request
AVC  Advanced Video Coding (also H.264 or MPEG-4 part 10)
CA  Conditional Access
CABAC  Context-Adaptive Binary Arithmetic Coding
CBR  Constant Bit Rate
CDN  Content Delivery Network
CoD  Content on Demand
DCR  Degradation Category Rating
DRM  Digital Rights Management
DSL  Digital Subscriber Line
DTS  Decoding Time Stamp
DVB  Digital Video Broadcasting
FCC  Fast Channel Change
FEC  Forward Error Correction
FR  Full Reference
GOP  Group Of Pictures
GPON  Gigabit-capable Passive Optical Network
HAS  HTTP Adaptive Streaming
HDS  HTTP Dynamic Streaming
HLS  HTTP Live Streaming
HNED  Home Network End Device
HTTP  Hypertext Transfer Protocol
IDR  Instantaneous Decoding Refresh
IP  Internet Protocol
IPTV  Television over Internet Protocol
ITU  International Telecommunication Union
LMB  Live Media Broadcast
LTE  Long Term Evolution
MDI  Media Delivery Index
MOS  Mean Opinion Score
MPEG  Moving Picture Experts Group
MSE  Mean Square Error
MVC  Multi-view Video Coding
NAL  Network Abstraction Layer
NR  No Reference
OTT  Over The Top multimedia delivery services
PCR  Program Clock Reference
PLEP  Packet Loss Effect Prediction metric
PLP  Packet Loss Pattern
PLR  Packet Loss Rate
PSNR  Peak Signal to Noise Ratio
PTS  Presentation Time Stamp
QoE  Quality of Experience
QoS  Quality of Service
QuEM  Qualitative Experience Monitoring
QuID  Quality Impairment Detector
RAP  Random Access Point
RET  RETransmission (synonym of ARQ)
RGW  Residential Gateway
RR  Reduced Reference
RTP  Real-Time Transport Protocol
SS  Smooth Streaming
STF  Severity Transfer Function
TCP  Transmission Control Protocol
UDP  User Datagram Protocol
VBR  Variable Bit Rate
VQEG  Video Quality Experts Group
To the loving memory of Juan

To Teresa
Chapter 1

Introduction

1.1 Motivation

There is little doubt about the social relevance of audiovisual delivery services since the first television broadcasts. During the second half of the 20th century, broadcast television channels controlled the audiovisual market and were the main communication path for information, culture, and entertainment. In the last decades, though traditional broadcasters are still quite relevant players in the content marketplace, their offer has been complemented by a plethora of new services: IP television, video on demand, web video portals, user-generated content...

The way in which content is consumed is rapidly changing, and two technological drivers have made this possible: digital video and IP networks. With the standardization of MPEG video in the 1990s, it became possible to consume video products at home with high quality and at an affordable cost. The popularization of the internet, at about the same time, brought the possibility of easily interconnecting any two points in the world. The combination of both events allowed video content to be managed, stored, and distributed homogeneously with the rest of the information. Somehow, the distribution of video to households had just become a problem of digital data communication and storage. And the main problem to solve was, consequently, finding enough bandwidth to fit the transmission requirements of video assets.

The first decade of the 21st century witnessed a quantitative change which resulted in a qualitative jump: improvements in video codec technologies and in the capacity of xDSL access networks made it possible to distribute real-time video over IP networks with a quality that could compete with that of television and DVDs. This gave birth to television over IP (IPTV), which introduced real interactivity and personalization into the audiovisual ecosystem. And in a few years' time, with subsequent generations of technological improvements, it has become possible to offer a competing video distribution service even over the standard best-effort internet, in what has been called over-the-top video delivery (OTT). This has significantly reduced the barriers to entry into the multimedia business. And, as this happens, new services are appearing beyond the classic television channels, ranging from huge video clubs over the internet to the distribution of personalized, or even user-generated, video content.

Together with the evolution of the services comes the problem of how to provide them with enough quality for the end users. The transmission of high quality video can be demanding for the capabilities of IP networks, especially in the access segment. Errors happen, and service providers struggle to keep them under control. The monitoring of Quality of Service (QoS) parameters, such as bit rate, packet loss rate, or delay, is not straightforward when the service is distributed over a complex IP network topology. And even when a suitable QoS monitoring system has been set up in the delivery service network, it proves insufficient. What really needs to be monitored is not strictly the QoS, but the QoE: the Quality of Experience perceived by the final customer.

There has been an important effort in the last decade to characterize the perceived quality of audiovisual content, as well as to find algorithms able to model it. A first method is using subjective quality assessment tests, where a panel of viewers evaluates the perceived quality of the video clips under study. This can provide quite accurate information about video quality and user preferences, but at the high cost of having a group of users involved in the assessment.
The complementary approach is developing objective quality metrics: algorithms which try to emulate the responses of those viewers by computer analysis of the video sequences. It has been a very active field of research, especially during the last decade. Dozens of algorithms have been developed, from simple measures of mean square error between images up to complex metrics which include information about Human Visual System (HVS) perception and about the visual structure of the impairments introduced in the video by the coding and transmission chain.

However, few of those methods have had a relevant impact on the market. There are commercially available quality probes which implement this kind of algorithm, but they are typically used just to measure the quality of the video compression process, and not always in real time. For the monitoring of quality in the distribution and access network, only network-based measures are used: packet losses, router failures... Moreover, in recent years, manufacturers of measurement equipment seem to have reduced their efforts to introduce these complex metrics in their equipment.
There are good reasons for that. Video QoE metrics are complex to develop and expensive to deploy in the field. They also cover a very specialized field of interest, frequently critical in the video headend and video production departments, but much rarer in service definition and network operation. In many cases the teams operating the network already have an overwhelming amount of QoS data which is hardly possible to manage, so there is little use in increasing the complexity of this information. Besides, monitoring algorithms need to be implemented in heavily-loaded routers or low-processing user terminals, and must therefore be extremely lightweight in their processing requirements, which may disqualify a large number of the metrics available in the literature. Finally, some metrics are even impossible to apply because the required information is unavailable at the monitoring point, as is the case, for instance, when parts of the video stream are encrypted by digital rights management (DRM) or conditional access (CA) systems.

In summary, service providers are still mainly using QoS metrics to monitor their networks, but this is because those are the metrics that are applicable under the budgetary, computing, and information availability restrictions that they have to cope with. There is still room for improvement. And this thesis wants to be a step in this direction, trying to reduce the gap between QoE expertise and multimedia delivery service providers. The focus of the work is precisely analyzing how to model, monitor, and manage the Quality of Experience under the mentioned restrictions.

The research of the thesis has been carried out over the last 8 years in the framework of the Grupo de Tratamiento de Imágenes research group at Universidad Politécnica de Madrid, in parallel with a professional career in the multimedia competence center of Alcatel-Lucent in Madrid.
In this time, services, products, research areas, and standardization efforts have evolved significantly. During the first years of the research, the line that we are proposing in this thesis was almost nonexistent in the most relevant journals, save for a couple of remarkable exceptions. In recent years, however, there has been an increasing interest in the research and standardization of monitoring strategies which are easier to apply in real operation environments.

1.2 Overview

The aim of this thesis is to provide architecture, models, and results which make it possible for multimedia service providers to control the Quality of Experience offered by their service in a way which is relevant for their interests, practical, and better than QoS-only monitoring schemes. It intends to answer the most frequently asked questions that a service provider can raise about the QoE it is offering: which elements determine the quality of the multimedia stream, which are the most relevant impairments in the perceived quality, what causes them, and how they can be monitored, prevented, and minimized. The thesis proposes a comprehensive strategy to address this problem as a whole, as well as detailed solutions for most of its elements.

Part of the inputs taken to create the approach presented in this thesis have come from the day-by-day experience of assessing IPTV service providers, designing solutions for them, and developing products for the content delivery market. All the assumptions taken in the development of the thesis will be supported either by the work itself or by previous works published in the scientific literature. However, broader decisions, such as the relevance of the problem to study or the general approach to it, are influenced by the experience of listening to customers, capturing their requirements, and understanding the advantages and disadvantages of different measurement schemes from a service provider point of view. This fact has no effect on the scientific quality of this work, but it may help understand its underlying motivation.

As a consequence, the work is probably biased towards this application-oriented approach in two different ways. On the one hand, there is a stronger focus on ideas and concepts than on the training of mathematical models or the extensive analysis of experimental results. As it is virtually impossible to simulate the working conditions of every possible service provider in the world, the research has been aimed at building models which depend as little as possible on the context where they are applied, or that can be easily adapted to any specific deployment. In a word: clean and generic models have been preferred to trained and optimized ones.
On the other hand, there has been an explicit effort to make sure that any architecture or algorithm proposed in this thesis can be directly applied to real multimedia delivery services. And, in fact, some of them have already been included in products which are currently deployed in the field.

The study starts by analyzing several aspects of the state of the art (Chapter 2). It defines what a multimedia delivery service is, which technologies it involves, and which are the most relevant problems affecting its quality. Although the market applicability of multimedia services is quite wide, the underlying technological problem is much more restricted. The existing techniques to model, analyze, and monitor multimedia quality are covered, with special focus on their applicability to content delivery services, and including the published studies which support or formalize the knowledge obtained by work experience.

Chapter 3 contains guidelines to design a multimedia delivery service which takes into account the Quality of Experience. It describes a reference architecture model for the service with some QoE-specific elements. It also proposes a specific design for a monitoring