Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze
Whitepaper, December 2012
Anita Banerjee
Contents
Introduction
Sorenson Squeeze
Intel QSV H.264
Power Performance
    Case Study 1: Compressing to ipad-apple_tv_1080p format
    The Overall Measurements Data
    Case Study 2: Encoding into 3 Different Formats Simultaneously - Parallel Transcoding
    The Overall Measurements Data
Conclusion
Tools and Methodologies

Table of Figures
Figure 1: Energy consumed when system is connected to AC power supply
Figure 2: Energy consumed when system is running in battery
Figure 3: Power consumption comparison for two codecs to encode the Buck Bunny file when system is connected to AC power supply
Figure 4: Power consumption comparison for two codecs to encode the Buck Bunny file when system is running in battery
Figure 5: GPU usage for Intel QSV H.264 codec
Figure 6: Parallel encoding when using Intel QSV H.264 codec
Figure 7: Energy consumed when system is connected to AC power supply
Figure 8: Energy consumed when system is running in battery
Figure 9: Power consumption by 2 different codecs to encode the same file into 3 different formats when system is connected to AC power supply
Figure 10: Power consumption by 2 different codecs to encode the same file into 3 different formats when system is running in battery

Table of Tables
Table 1: Measurements when system is running in AC power supply
Table 2: Measurements when system is running in battery
Table 3: Measurements when system is running in AC power supply
Table 4: Measurements when system is running in battery
Introduction

Ever since mobile devices, ranging from notebooks and netbooks to handheld devices such as tablets, smartphones, and PDAs, were introduced, the need to conserve power has steadily increased. As people adopt these devices and make them part of their daily lives, power has become more critical than ever before. Growing demand for application performance, features, services, and network reliability all adds to power consumption, so conserving battery life has also become very important. This whitepaper describes how power consumption, and thus battery life, improves when you use the Intel Quick Sync Video (QSV) H.264 codec. The study was conducted using Sorenson Squeeze Pro* 8.5.0.41. Intel QSV H.264 is compared against a software codec, the Sorenson MPEG-4 codec, to analyze the power benefits.

The System: HP EliteBook* 8470p, Intel Core i7-3720QM @ 2.60 GHz, Intel HD Graphics 4000, 8 GB RAM, Microsoft Windows 7* Professional SP1 64-bit. Graphics Driver Version: 15.28.7.64.2867 (9.17.10.2867).

The Video File: A 1080p AVI file (big_buck_bunny_1080p_stereo.avi). File Size = 697,993 KB, Length = 9 min 56 sec, Video Bitrate = 9586 kbps, Audio Bitrate = 245 kbps, Frame Rate = 24 Frames/Sec.

This video file is encoded into different formats to observe the power benefits. The formats used are described in detail in the Power Performance section.
Sorenson Squeeze

Sorenson Squeeze is an easy-to-use software suite for video compression. It encodes to multiple formats including MPEG-4, Windows Media Video*, QuickTime*, Adobe Flash Video*, MPEG-1 and MPEG-2, and WebM. It also supports parallel transcoding, which we use in our analysis (Case Study 2).

The newest version, Squeeze 8.5, offers:
- Benchmark encoding speeds averaging 200% faster than Squeeze 8, including:
    o Single Output Acceleration for MP4, MOV, MKV, WebM
    o Adaptive Bitrate Acceleration
    o Intel Quick Sync Video (QSV) H.264 codec options
- New Review & Approval capability in the Cloud through Squeeze's enhanced workflow and collaboration tools, including up to 5 GB of Free, Permanent Storage for videos
- CPU Load Controls
- MPEG DASH encoding
- Manual Timecode Controls
- JPEG2000 Decode (MXF, Elem)

We used Sorenson Squeeze Pro 8.5.0.41 for the tests. For more information visit: http://www.sorensonmedia.com/.
Intel QSV H.264

Intel QSV H.264 is a video compression/decompression component (codec) that uses Intel Quick Sync Video technology. Intel Quick Sync Video is Intel's hardware-accelerated video encoding, decoding, and preprocessing technology, part of the Intel HD Graphics integrated into 2nd and 3rd generation Intel Core processors.

Intel Quick Sync Video makes fast work of creating, editing, synchronizing, and sharing your videos at home and online. With it you can create DVDs or Blu-ray* discs, create and edit 3D videos, convert 2D video files into 3D, and convert video for your portable media player and for uploading to your favorite social networking sites, all in a flash.

The Intel QSV H.264 codec is implemented using the Intel Media SDK APIs, which expose Intel Quick Sync Video and allow video processing work to be offloaded to the integrated Intel HD Graphics. Instead of performing video encoding and decoding only on the CPU, the Intel QSV H.264 codec offloads some of the work onto Intel HD Graphics, which runs in parallel with the CPU cores. Intel QSV H.264 gives the best of both worlds: performance and energy conservation.
Power Performance

To obtain a clear comparison, the system power settings were kept identical across the tests and case studies. The settings for the Intel QSV H.264 codec and the Sorenson MPEG-4 codec were also kept the same across all tests and case studies.

Case Study 1: Compressing to ipad-apple_tv*_1080p format.

Intel QSV H.264 Settings: 1-Pass VBR, Frame Rate = 1:1 Frames/Sec, Data Rate = 5000 kbps, AVC Profile: Baseline, Level: 4.0, No B frames, Maintain Aspect Ratio, Key frame every 10 Sec, Auto ClosedGOP and CABAC, Top field first. No filters. Audio: Coding Technologies AAC, 160 Kbps, 48000 Hz, Stereo channel, Sample Size = 16.

Sorenson MPEG-4 Settings: 1-Pass VBR, Frame Rate = 1:1 Frames/Sec, Data Rate = 5000 kbps, No B frames, Maintain Aspect Ratio, 1920x1080, Key frame every 10 Sec, Auto Key frame (50), MPEG4Quantization. No filters. Audio: Coding Technologies AAC, 160 Kbps, 48000 Hz, Stereo channel, Sample Size = 16.

The following graphs (Figure 1 and Figure 2) compare the processor energy and battery charge consumed while encoding the video file with the two codecs under the settings above.

In real-life use cases, from browsing to video playback, processor power is the critical quantity to monitor, because the power consumed by other system elements (display, memory, SSD, etc.) varies little or not at all across applications. That is why we monitored processor power usage in particular and examined how overall battery charge consumption and battery life were affected. Here, processor power means the power consumed by the entire CPU package: CPU (core), GPU (HD Graphics), and Intel uncore.

Figure 1 and Figure 2 show the total processor energy and battery charge consumed by the Intel QSV H.264 codec and the Sorenson MPEG-4 codec to encode the entire big_buck_bunny_1080p_stereo.avi file (hereafter referred to as Buck Bunny) under the settings mentioned above.

For a thorough study, we analyzed both situations: the system running with an AC power supply connected, and the system running on battery. Intel Power Gadget and Intel Battery Life Analyzer (BLA), described in the Tools and Methodologies section below, were used to collect these data.
[Figure 1: Energy consumed when system is connected to AC power supply. Bar chart of Time Needed (sec), Processor Energy, and Battery Charge for the Intel QSV H.264 codec and the Sorenson codec.]

[Figure 2: Energy consumed when system is running in battery. Bar chart of Time Needed (sec), Processor Energy, and Battery Charge for the Intel QSV H.264 codec and the Sorenson codec.]

The following graphs (Figure 3 and Figure 4) show the detailed power usage for each case. These analyses were conducted using the Intel Power Gadget tool. Details of the tool are given in the Tools and Methodologies section.
[Figure 3: Power consumption comparison for two codecs to encode the Buck Bunny file when system is connected to AC power supply. Y axis: Power (Watts); X axis: Elapsed Time (Seconds). Series: Processor Power Usage for the Intel QSV codec and for the Sorenson codec.]

[Figure 4: Power consumption comparison for two codecs to encode the Buck Bunny file when system is running in battery. Y axis: Power (Watts); X axis: Elapsed Time (Seconds). Series: Processor Power Usage for the Intel QSV codec and for the Sorenson codec.]

The graphs above show that the total energy consumed to encode the file with the Intel QSV H.264 codec is much less than with the Sorenson MPEG-4 codec. When Sorenson Squeeze uses the Sorenson MPEG-4 codec, power consumption stays steadily high for the entire encoding time, which is in the range of ~400-600 sec. When Sorenson Squeeze uses the Intel QSV H.264 codec, the encoding work completes in the range of ~150-180 sec, which reduces the total energy consumed.
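The relationship between the power curves in Figure 3 and Figure 4 and the energy totals reported later can be sketched as a simple integration of power over time. The sample values below are hypothetical and are not measurements from this study; the function name and trace layout are also assumptions made for illustration.

```python
def energy_joules(samples):
    """Integrate (time_seconds, power_watts) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


# Hypothetical traces (NOT data from this paper): one long run at steady
# power, and one shorter run at a somewhat higher power level.
long_run = [(0, 35.0), (100, 35.0), (200, 35.0), (300, 35.0), (400, 35.0)]
short_run = [(0, 40.0), (50, 40.0), (100, 40.0), (150, 40.0)]

long_j = energy_joules(long_run)
short_j = energy_joules(short_run)

print(f"long run:  {long_j:.0f} J")
print(f"short run: {short_j:.0f} J")
print(f"ratio:     {long_j / short_j:.2f}x")
```

In this made-up example the shorter run draws more power at every instant but still consumes less energy in total, because energy is the area under the power curve and the run ends much sooner.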
Intel HD Graphics usage can be measured using the Intel Graphics Performance Analyzers (GPA) tool. The GPU tab shows the overall usage percentage of Intel HD Graphics. This tool is described in detail in the Tools and Methodologies section.

GPA shows that the GPU was not used at all with the Sorenson MPEG-4 codec. With the Intel QSV H.264 codec, the GPU usage is as shown below (Figure 5):

[Figure 5: GPU usage for Intel QSV H.264 codec]

This is another way to see how the Intel QSV H.264 codec offloads the task to Intel HD Graphics and thus saves both energy and time.

Sorenson Squeeze is optimized to use all system components to the maximum. It runs parallel decode/preprocessing/encode sessions to fully utilize all CPU threads. With the Intel QSV H.264 codec, most of the transcoding work is offloaded to Intel HD Graphics, so Squeeze carries on parallel work with the available CPU bandwidth and thus saves both time and energy.
The Overall Measurements Data:

Codec             Time Needed (sec)   Processor Energy   Battery Charge   Battery Charge (%)   Frames Encoded   Frames encoded per second
Intel QSV H.264   144.89              1664.6             555              1.08                 14315            98.79
Sorenson MPEG-4   437.24              4603.6             1898             3.45                 14315            32.73

Table 1: Measurements when system is running in AC power supply

Codec             Time Needed (sec)   Processor Energy   Battery Charge   Battery Charge (%)   Frames Encoded   Frames encoded per second
Intel QSV H.264   169.45              1183.73            2031             3.73                 14315            84.47
Sorenson MPEG-4   570.45              3362.36            5428             9.24                 14315            25.09

Table 2: Measurements when system is running in battery
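As a rough cross-check, the relative gains for Case Study 1 can be derived directly from Table 1 and Table 2. The sketch below copies the table values as reported; the dictionary layout and function name are ours and only illustrative. Because the values are used only as ratios, their units cancel.

```python
# Values copied from Table 1 (AC power) and Table 2 (battery).
CASE_STUDY_1 = {
    "AC power": {
        "qsv": {"time_s": 144.89, "proc_energy": 1664.6, "batt_charge": 555, "frames": 14315},
        "sorenson": {"time_s": 437.24, "proc_energy": 4603.6, "batt_charge": 1898, "frames": 14315},
    },
    "battery": {
        "qsv": {"time_s": 169.45, "proc_energy": 1183.73, "batt_charge": 2031, "frames": 14315},
        "sorenson": {"time_s": 570.45, "proc_energy": 3362.36, "batt_charge": 5428, "frames": 14315},
    },
}


def gains(run):
    """Return QSV-over-Sorenson gain factors for one power condition."""
    qsv, sw = run["qsv"], run["sorenson"]
    fps_qsv = qsv["frames"] / qsv["time_s"]
    fps_sw = sw["frames"] / sw["time_s"]
    return {
        "frame_rate": fps_qsv / fps_sw,
        "proc_energy": sw["proc_energy"] / qsv["proc_energy"],
        "batt_charge": sw["batt_charge"] / qsv["batt_charge"],
    }


for condition, run in CASE_STUDY_1.items():
    g = gains(run)
    print(
        f"{condition}: frame rate {g['frame_rate']:.2f}x, "
        f"processor energy {g['proc_energy']:.2f}x, "
        f"battery charge {g['batt_charge']:.2f}x"
    )
```

With these inputs the frame-rate gain comes out at roughly 3.0x on AC power and 3.4x on battery, and the processor energy ratio at roughly 2.8x in both conditions.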
Case Study 2: Encoding into 3 Different Formats Simultaneously - Parallel Transcoding.

As mentioned in the Sorenson Squeeze section, parallel transcoding is an important feature of Sorenson Squeeze, and this study examines the power benefits in that situation. For this study, we encoded the video file big_buck_bunny_1080p_stereo.avi into 3 different video formats simultaneously (ipad-apple_tv_1080p, Blu-ray_29.97_1080i, and YouTube_1080p) using both codecs under study. The codec settings were as follows:

Intel QSV H.264

ipad-apple_tv_1080p Settings: 1-Pass VBR, Frame Rate = 1:1 Frames/Sec, Data Rate = 5000 kbps, AVC Profile: Baseline, Level: 4.0, No B frames, Maintain Aspect Ratio, 1920x1080, Key frame every 10 Sec, ClosedGOP and CABAC: auto, Top field first. No filters. Audio: Coding Technologies AAC, 160 Kbps, 48000 Hz, Stereo channel, Sample Size = 16.

Blu-ray_29.97_1080i Settings: 1-Pass VBR, Frame Rate = 1:1 Frames/Sec, Data Rate = 5000 kbps, AVC Profile: Baseline, Level: 4.0, No B frames, Maintain Aspect Ratio, 1920x1080, Key frame every 10 Sec, ClosedGOP and CABAC: auto, Top field first. No filters. Audio: Coding Technologies AAC, 64 Kbps, 44100 Hz, Stereo channel, Sample Size = 16.

YouTube_1080p Settings: 1-Pass VBR, Frame Rate = 1:1 Frames/Sec, Data Rate = 5000 kbps, AVC Profile: Baseline, Level: 4.0, No B frames, Maintain Aspect Ratio, 1920x1080, Key frame every 10 Sec, ClosedGOP and CABAC: auto, Top field first. No filters. Audio: Coding Technologies AAC, 256 Kbps, 44100 Hz, Stereo channel, Sample Size = 16.

Sorenson MPEG-4

ipad-apple_tv_1080p Settings: 1-Pass VBR, Frame Rate = 1:1 Frames/Sec, Data Rate = 5000 kbps, No B frames, Maintain Aspect Ratio, 1920x1080, Key frame every 10 Sec, Auto Key frame (50), MPEG4Quantization. No filters. Audio: Coding Technologies AAC, 160 Kbps, 48000 Hz, Stereo channel, Sample Size = 16.

Blu-ray_29.97_1080i Settings: 1-Pass VBR, Frame Rate = 1:1 Frames/Sec, Data Rate = 5000 kbps, No B frames, Maintain Aspect Ratio, 1920x1080, Key frame every 10 Sec, Auto Key frame (50), MPEG4Quantization. No filters. Audio: Coding Technologies AAC, 64 Kbps, 44100 Hz, Stereo channel, Sample Size = 16.
YouTube_1080p Settings: 1-Pass VBR, Frame Rate = 1:1 Frames/Sec, Data Rate = 5000 kbps, No B frames, Maintain Aspect Ratio, 1920x1080, Key frame every 10 Sec, Auto Key frame (77), MPEG4Quantization. No filters. Audio: Coding Technologies AAC, 256 Kbps, 44100 Hz, Stereo channel, Sample Size = 16.

As in Case Study 1, the Intel QSV H.264 codec uses the Intel Media SDK APIs and Intel Quick Sync Video technology to offload parts of video transcoding to the integrated Intel HD Graphics, which runs in parallel with the CPU. In this case study, because the Intel QSV H.264 codec achieves parallelism across a large workload (3 encoding tasks), its benefits in both speed and energy conservation are severalfold compared to other codecs such as Sorenson MPEG-4. The parallelism achieved during encoding can be observed in the following figure:

[Figure 6: Parallel encoding when using Intel QSV H.264 codec]

As in Case Study 1, we analyzed the processor power and battery charge consumption for both codecs. As explained in the first case study, processor power is the most important quantity to watch, so we monitored it in particular and observed how overall battery charge consumption and battery life were affected. By processor power we again mean the power consumed by the CPU (core), GPU (HD Graphics), and Intel uncore.

Figure 7 and Figure 8 show the total energy and battery charge consumed by the Intel QSV H.264 codec and the Sorenson MPEG-4 codec to encode Buck Bunny into 3 formats in parallel.
For a thorough study, we analyzed the encoding process under both circumstances: the system running with an AC power supply connected, and the system running only on battery.

[Figure 7: Energy consumed when system is connected to AC power supply. Bar chart of Time Needed (sec), Processor Energy, and Battery Charge for the Intel QSV H.264 codec and the Sorenson codec.]

[Figure 8: Energy consumed when system is running in battery. Bar chart of Time Needed (sec), Processor Energy, and Battery Charge for the Intel QSV H.264 codec and the Sorenson codec.]
Figure 9 and Figure 10 illustrate the differences in power consumption when encoding big_buck_bunny_1080p_stereo.avi into 3 different formats simultaneously using the Intel QSV H.264 codec and the Sorenson MPEG-4 codec:

[Figure 9: Power consumption by 2 different codecs to encode the same file into 3 different formats when system is connected to AC power supply. Y axis: Power (Watts); X axis: Elapsed Time (Seconds). Series: Processor Power Usage for the Intel QSV codec and for the Sorenson codec.]

[Figure 10: Power consumption by 2 different codecs to encode the same file into 3 different formats when system is running in battery. Y axis: Power (Watts); X axis: Elapsed Time (Seconds). Series: Processor Power Usage for the Intel QSV codec and for the Sorenson codec.]
The Overall Measurements Data:

Codec             Time Needed (sec)   Processor Energy   Battery Charge   Battery Charge (%)   Frames Encoded   Frames encoded per second
Intel QSV H.264   437.98              4979.89            1887             6.27306              42945            98.05
Sorenson MPEG-4   1351.03             14266.79           8081             15.4112              42945            31.78

Table 3: Measurements when system is running in AC power supply

Codec             Time Needed (sec)   Processor Energy   Battery Charge   Battery Charge (%)   Frames Encoded   Frames encoded per second
Intel QSV H.264   612.72              3721.34            6760             18.3436              42945            70.08
Sorenson MPEG-4   1905.01             10667.25           17849            29.2578              42945            22.54

Table 4: Measurements when system is running in battery
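One way to compare the single-output and three-output runs on a common footing is to normalize processor energy by the number of frames encoded. The sketch below does this for Tables 1 through 4, using the processor energy values exactly as reported (in the tables' own units). The per-1000-frames metric and the helper name are our illustration and not part of the original measurements.

```python
def energy_per_kframe(proc_energy, frames):
    """Processor energy (in the tables' units) per 1000 encoded frames."""
    return proc_energy / frames * 1000.0


# (case, power condition, codec, processor energy, frames encoded),
# copied from Tables 1-4.
ROWS = [
    ("Case 1", "AC power", "Intel QSV H.264", 1664.6, 14315),
    ("Case 1", "AC power", "Sorenson MPEG-4", 4603.6, 14315),
    ("Case 1", "battery", "Intel QSV H.264", 1183.73, 14315),
    ("Case 1", "battery", "Sorenson MPEG-4", 3362.36, 14315),
    ("Case 2", "AC power", "Intel QSV H.264", 4979.89, 42945),
    ("Case 2", "AC power", "Sorenson MPEG-4", 14266.79, 42945),
    ("Case 2", "battery", "Intel QSV H.264", 3721.34, 42945),
    ("Case 2", "battery", "Sorenson MPEG-4", 10667.25, 42945),
]

for case, condition, codec, energy, frames in ROWS:
    per_k = energy_per_kframe(energy, frames)
    print(f"{case} {condition:<9} {codec:<16} {per_k:8.2f} per 1000 frames")
```

Normalized this way, each codec's per-frame energy is similar between Case Study 1 and Case Study 2 under the same power condition, so the gap between the two codecs carries over from the single encode to the parallel workload.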
Conclusion

Examining the system power and battery energy consumption at the different stages described in this paper shows how Intel QSV H.264 provides power efficiency, performance benefits, and system responsiveness over a software codec such as the Sorenson MPEG-4 codec.

The overall gain:

For a single encoding (Case Study 1):
- 3+ times frame rate.
- 2.7 times processor energy savings.
- 3.4 times (when system running in AC power supply) and 2.6 times (when system running in battery) battery energy savings.

For parallel transcoding (Case Study 2):
- 3+ times frame rate.
- 2.8 times processor energy savings.
- 4 times (when system running in AC power supply) and 2.6 times (when system running in battery) battery energy savings.

Media acceleration powered by Intel Quick Sync Video technology allows extremely power-efficient video processing while still offering programmable flexibility and high throughput.

About the Author

Anita Banerjee is a Media Performance Engineer in Intel's Developer Relations Division. Anita is focused on optimizing the performance and quality of PC multimedia software for Intel architecture. Anita has developed various device drivers, OS kernels, and application software, and worked at ATI, Motorola, NVIDIA, and Océ-Canon group prior to joining Intel in 2012.
Tools and Methodologies

Intel Graphics Performance Analyzers (GPA):
The Intel Graphics Performance Analyzers 2012 (Intel GPA) is a suite of graphics analysis and optimization tools to help game developers make games and other graphics-intensive applications run even faster on Intel Core and Intel Atom processor based platforms. Intel GPA consists of three powerful analysis and optimization tools:
- Intel GPA System Analyzer
- Intel GPA Frame Analyzer
- Intel GPA Platform Analyzer
For the studies in this white paper, Intel GPA (version 2012 R4) was used to measure Intel HD Graphics (GPU) utilization at encoding time.
Intel Graphics Performance Analyzers (GPA) is available at: http://software.intel.com/en-us/vcsource/tools/intel-gpa

Intel Power Gadget 2.0:
Intel Power Gadget 2.0 is a software-based power estimation tool enabled for 2nd Generation Intel Core processors. It includes a Microsoft Windows* sidebar gadget, driver, and libraries to monitor and estimate real-time processor package power information in watts using the energy counters in the processor. With this release, we are providing functionality to evaluate power information on various platforms including desktops/notebooks and servers.
Intel Power Gadget 2.0 was used to measure the processor, IA, and GT energy consumed during the encoding time.
Intel Power Gadget 2.0 is available at: http://software.intel.com/en-us/articles/intel-power-gadget/

Intel Battery Life Analyzer (BLA):
The Intel Battery Life Analyzer (BLA) is a tool that monitors various software and hardware activities that affect battery life. We used the Intel Battery Life Analyzer (BLA) 2.3.0.1041 to measure the battery energy consumption during the encoding time.
Intel Battery Life Analyzer (BLA) is available at: http://downloadcenter.intel.com/detail_desc.aspx?agr=y&dwnldid=19351
Notices INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. 
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm Intel, the Intel logo, VTune, Cilk and Xeon are trademarks of Intel Corporation in the U.S. and other countries. *Other names and brands may be claimed as the property of others Copyright 2012 Intel Corporation. All rights reserved.
Optimization Notice
http://software.intel.com/en-us/articles/optimization-notice/

Performance Notice
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks