Assessment of VoIP Quality over Internet Backbones
Athina Markopoulou & Fouad Tobagi, Electrical Engineering Dept., Stanford University
Mansour Karam, RouteScience Technologies, Inc.
INFOCOM 2002, 06/25/02
Motivation
- The Internet has been used successfully to support data applications.
- It is desirable to support more applications, such as voice over IP (VoIP).
  - Voice has strict quality and interactivity requirements.
  - Users have high quality expectations.
- Question: How well does the Internet support voice communications today?
Outline
- VoIP System and Impairments
- Measurements of ISP backbones
- Voice Quality Assessment
- Numerical Results
- Summary
VoIP system and impairments
- System: voice source (talkspurts and silence) -> sender (encoder, packetizer) -> network -> receiver (depacketizer & playout buffer, decoder & concealment) -> quality assessment algorithm
- Impairments:
  - Speech quality: compression, packet loss (< 2-5%)
  - Delay impairments: interactivity (< 150 ms), echo
Playout Scheduling
- Fixed playout or adaptive playout?
- (Figure: send time, receive time and playout time of each packet.)
- Example adaptive algorithms:
  - [RKTS94] Ramjee, Kurose, Towsley, Schulzrinne: estimate delay and variance using moving averages and spike detection; adapt the playout time at the beginning of each talkspurt.
  - [MKT98] Moon, Kurose, Towsley: improvements in estimation and spike detection.
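The moving-average idea behind adaptive playout can be sketched as follows. This is a minimal sketch, not the authors' or [RKTS94]'s code: it uses an exponentially weighted estimate of the delay and of its variation, sets the playout delay to the estimate plus a multiple of the variation, and changes it only at the start of a talkspurt. Spike detection is omitted. The function name, the packet representation, and the default values of `alpha` and `safety` are assumptions made for illustration.

```python
def adaptive_playout_delays(packets, alpha=0.998002, safety=4.0):
    """Return the playout delay (ms) applied to each packet.

    packets: iterable of (network_delay_ms, starts_talkspurt) pairs.
    alpha, safety: illustrative defaults (assumptions, not from the slides).
    """
    d_hat = None      # running estimate of the network delay
    v_hat = 0.0       # running estimate of the delay variation
    current = None    # playout delay of the current talkspurt
    result = []
    for delay, starts_talkspurt in packets:
        if d_hat is None:
            d_hat = float(delay)          # initialise on the first packet
        else:
            d_hat = alpha * d_hat + (1.0 - alpha) * delay
            v_hat = alpha * v_hat + (1.0 - alpha) * abs(d_hat - delay)
        # The playout delay is only adjusted at talkspurt boundaries, so
        # the spacing of packets inside a talkspurt is preserved.
        if starts_talkspurt or current is None:
            current = d_hat + safety * v_hat
        result.append(current)
    return result


if __name__ == "__main__":
    trace = [(50, True), (80, False), (90, False), (60, True)]
    print(adaptive_playout_delays(trace, alpha=0.9))
```

A packet whose network delay exceeds the playout delay of its talkspurt arrives too late to be played and counts as lost.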
Measurements collection
- Measurements collected and provided by RouteScience Technologies, Inc.
- Probes sent over the backbone networks of major ISPs
- (Figure: map of the measured paths between AND, EWR, SJC, THR and ASH over providers P1-P7.)
- 5 cities, 7 ISPs, 43 paths, 3 days in June 2001
- 50-byte probes sent every 10 ms
- Synchronized using GPS
Delay and loss characteristics
Delay
- Fixed part:
  - East coast: 3.25-11.8 ms
  - Coast-Colorado: 28.3-77.8 ms
  - Coast-to-coast: 31.3-47.2 ms
- Delay variability:
  - Pattern: mainly spikes
  - During the day
  - (Figure: delay in ms vs. send time in sec; spikes described by their height above the fixed delay, their width and their frequency.)
Loss
- Mainly outages (reliability problems)
  - Last 0.5-2 minutes
  - Happen at least once per day for 6 of the 7 providers
  - Usually precede changes in the fixed part of the delay
  - Often happen simultaneously on more than one path of the same provider
Consistent patterns per provider (and per path, time of the day)
Example 1: a low variability path
- Long distance path SJC-P7-ASH (fixed delay = 40.5 ms)
- No loss except for a 2-minute outage
- (Figure: max delay and the min, 50%, 99% and 99.9% percentiles of delay.)
Example 2: a high variability path
- Short distance path EWR-P1-ASH (fixed delay = 11.8 ms)
- Large delay and large delay variability
- Sustained periods of congestion
Example 2 cont'd: Outages
- Loss on the previous path (EWR-P1-ASH, on Wed 06/27/01)
- 7 outages per day, 20-40 seconds each
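Counting outages in a probe trace can be sketched as below. This is an illustrative sketch only: it treats an outage as a run of consecutive lost probes lasting at least `min_outage_s`. The 10 ms probe interval comes from the measurement setup, while the function name, the boolean trace representation, and the 1-second threshold are assumptions.

```python
def find_outages(received, probe_interval_s=0.01, min_outage_s=1.0):
    """Return a list of (start_time_s, duration_s) for each outage.

    received: sequence of booleans, one per probe, True if it arrived.
    min_outage_s: hypothetical threshold separating outages from
                  isolated losses (an assumption, not from the slides).
    """
    outages = []
    run_start = None
    # A trailing True acts as a sentinel so a run at the end is closed.
    for i, ok in enumerate(list(received) + [True]):
        if not ok and run_start is None:
            run_start = i                      # a run of losses begins
        elif ok and run_start is not None:
            duration = (i - run_start) * probe_interval_s
            if duration >= min_outage_s:
                outages.append((run_start * probe_interval_s, duration))
            run_start = None                   # the run of losses ends
    return outages


if __name__ == "__main__":
    trace = [True] * 100 + [False] * 200 + [True] * 50 + [False] * 5 + [True] * 10
    print(find_outages(trace))
```

Short runs of loss below the threshold are ignored here; they would be handled as ordinary packet loss rather than as outages.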
Example 3: Periodic Pattern
- Clusters of spikes appear every ~70 sec.
- They last 3 sec and are as high as 300-500 ms.
- All 6 paths of provider P4 exhibit this pattern all the time.
Summary of paths
Providers: Provider 1, Provider 2, Provider 3, Provider 4, Provider 5, Provider 6, Provider 7

Number of paths per delay pattern:
- Very low variability: 2, 1, 2, 2, 5, 2 (1/3 of paths already have excellent VoIP performance)
- Loaded paths: 6 (poor VoIP performance)
- Occasionally higher delay: 4, 1 (hours), 4, 4 (minutes), 4 (seconds) (can perform well with playout)
- Periodic pattern: 6

Outages          P1     P2     P3     P4      P5    P6     P7
Duration (sec)   1-40   1-15   6-20   15-30   1-3   1-12   116,160
Per day          1-10   2-3    1-2    3-5     1     1-2    1
VoIP Quality
E-model rating (R), speech transmission quality (user satisfaction) and Mean Opinion Score (MOS):

R        Speech transmission quality (user satisfaction)   MOS
90-100   Best (very satisfied)                              4.3-4.5
80-90    High (satisfied)                                   4.0-4.3
70-80    Medium (some users dissatisfied)                   3.6-4.0
60-70    Low (many users dissatisfied)                      3.1-3.6
50-60    Poor (nearly all dissatisfied)                     2.6-3.1
0-50     Not recommended                                    1-2.6

The scale also marks "Desirable" and "Acceptable" ranges.
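The R and MOS values on this scale are related by the standard conversion defined with the E-model in ITU-T G.107. A minimal sketch of that mapping follows; the function name `r_to_mos` is an assumption.

```python
def r_to_mos(r):
    """Map an E-model rating R to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    # Cubic mapping between the two scales for 0 < R < 100.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


if __name__ == "__main__":
    for r in (50, 60, 70, 80, 90, 100):
        print(r, round(r_to_mos(r), 2))
```

Evaluating it at R = 50, 60, 70, 80, 90 and 100 reproduces, to one decimal place, the MOS values 2.6, 3.1, 3.6, 4.0, 4.3 and 4.5 listed on the scale.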
E-model
- Many studies have assessed individual impairments.
- The E-model combines all impairments into a single rating.
- Impairments are additive on the appropriate psychoacoustic scale:
  $R = (R_o - I_s) - I_d - I_e + A$
  - $I_d$: delay impairment (echo, delay)
  - $I_e$: distortion impairment (codec, loss rate)
- Inputs:
  - Loss of interactivity (conversation type) and echo on the path (cancellation) -> delay impairment ($I_d$)
  - Codec (type, e.g. G.711; concealment) and loss (rate, duration, pattern) -> distortion impairment ($I_e$)
  - Other parameters (noise, user tolerance)
- Output: the E-model rating R, which is converted to MOS (R2MOS).
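Because the impairments are additive, the rating R = (Ro - Is) - Id - Ie + A is simple to compute once each term is known. The sketch below only combines the terms and names the resulting quality band; it does not derive Id or Ie from delay, codec or loss. The function names and all numeric inputs in the example are hypothetical.

```python
def e_model_rating(ro_minus_is, i_d, i_e, a=0.0):
    """Combine impairments additively: R = (Ro - Is) - Id - Ie + A."""
    return ro_minus_is - i_d - i_e + a


def quality_band(r):
    """Name the speech transmission quality band for a rating R."""
    bands = [
        (90, "Best"),
        (80, "High"),
        (70, "Medium"),
        (60, "Low"),
        (50, "Poor"),
    ]
    for lower_bound, name in bands:
        if r >= lower_bound:
            return name
    return "Not recommended"


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the arithmetic.
    r = e_model_rating(ro_minus_is=93.0, i_d=5.0, i_e=10.0)
    print(r, quality_band(r))
```

With these made-up inputs the rating is 78, which falls in the Medium band (some users dissatisfied).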
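Loss-Delay trade-off
- Overall quality depends on both delay and loss.
- Playout scheduling controls both: a larger playout delay adds to the end-to-end delay but reduces late loss, the q% of packets whose network delay n exceeds the playout delay.
- (Figure: loss vs. delay, with contours at MOS = 3.6, 4.0 and 4.3.)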
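The trade-off can be made concrete by computing the late loss for several fixed playout delays on the same delay trace. This is a minimal sketch under the assumption that a packet is lost whenever its network delay exceeds the playout delay; the function names and the sample delays are hypothetical.

```python
def late_loss_fraction(delays_ms, playout_ms):
    """Fraction of packets whose network delay exceeds the playout delay."""
    if not delays_ms:
        raise ValueError("delay trace is empty")
    late = sum(1 for d in delays_ms if d > playout_ms)
    return late / len(delays_ms)


def loss_delay_tradeoff(delays_ms, candidate_playouts_ms):
    """Pair each candidate playout delay with the late loss it causes."""
    return [(p, late_loss_fraction(delays_ms, p)) for p in candidate_playouts_ms]


if __name__ == "__main__":
    delays = [40, 45, 50, 120, 300]   # made-up network delays in ms
    for playout, loss in loss_delay_tradeoff(delays, [100, 150, 350]):
        print(playout, loss)
```

Raising the playout delay lowers the late loss but increases the delay heard by the users, so both values have to be fed into the quality rating together.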
Call Assessment Methodology
- Impairments vary over time and loss patterns are arbitrary.
- Partition the trace into good and bad intervals [A.Clark, IP Telephony 2001].
- Transitions are perceived with delay ("recency") [France Telecom R&D, ITU-T ST.12 contr., 1999].
- Compute a rating at the end of the call [France Telecom R&D, ITU-T ST12., 2000] [A.Clark, IP Telephony 2001].
- (Figure labels: 1% loss, 15%, 5%, 20%.)
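The statement that transitions are perceived with delay can be illustrated with a simple exponential model: the perceived impairment drifts toward the impairment of the current interval instead of jumping to it. This is a sketch of the general idea only, not the cited method; the function name, the interval representation, and both time constants are assumptions.

```python
import math


def perceived_impairment(intervals, tau_worse_s=5.0, tau_better_s=15.0, start=0.0):
    """Return the perceived impairment at the end of each interval.

    intervals: list of (duration_s, impairment) pairs in time order,
               e.g. alternating good and bad intervals of a call.
    tau_worse_s, tau_better_s: hypothetical time constants for how fast
               listeners notice degradation and improvement.
    """
    perceived = start
    history = []
    for duration_s, level in intervals:
        # Use one time constant when quality degrades, another when it recovers.
        tau = tau_worse_s if level > perceived else tau_better_s
        perceived = level + (perceived - level) * math.exp(-duration_s / tau)
        history.append(perceived)
    return history


if __name__ == "__main__":
    # Made-up call: 30 s good, 5 s bad, 10 s good.
    print(perceived_impairment([(30, 2.0), (5, 25.0), (10, 2.0)]))
```

The last value in the returned list is the perceived impairment at the end of the call, which is the quantity a rating at the end of the call would build on.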
Example of call assessment
- Consider a call over a 100 sec network trace.
- Consider fixed playout at 120 ms.
- Steps for translating the network trace to call quality: (figure; labels: 78%, 67%)
Comparing providers (one hour)
- Consider all paths from New York to Ashburn at the same time (3-4pm EST, Wed 06/27/01).
- The quality of a call depends on the provider.
- (Figures: fixed playout at 150 ms and fixed playout at 250 ms; curves labeled P4, P1, P8, P9.)
A low variability path
- EWR-P6-SJC, Wed 06/27/01
- Delay in [38, 40] ms and spikes of 80-250 ms. Some loss.
- Fixed playout at 100 ms
- (Figure annotations: 20 sec congestion; 1 minute loss (routing change); Loss < 220 ms)
A high variability path
- THR-P1-ASH
- Min delay = 77.8 ms, Max Delay = 120 670 ms
- Loss negligible
- An entire day: Wed 06/27/01
- Call statistics for 1 hour: 14-15
- (Figure: adaptive playout vs. fixed playout at 100 ms, 150 ms and 200 ms.)
Summary
- We have studied VoIP quality over Internet backbones using delay and loss measurements and perceived quality measures.
- We found that:
  - The Internet is capable of carrying VoIP at high quality; this is already the case for many ISPs.
  - However, today there are still problems in the backbones:
    - Outages and network control traffic
    - Delay jitter: spikes
Contact: {amarko, tobagi}@stanford.edu, mans@routescience.com