Investigating Wavelet Based Video Conferencing System
Team Members:
o Ahtsham Ali
o Adnan Ahmed (in New Zealand for grad studies)
o Adil Nazir (starting MS at LUMS now)
o Waseem Khan
o Farah Parvaiz
o Muhammad Hussnain (pursuing MS in Sweden)
o Ahmed Majeed, Farooq Ali (in the USA for grad studies)
Supervisors:
o Dr. Shahid Masud
o Dr. Nadeem A. Khan
Lahore University of Management Sciences
Investigating Wavelet Based Video Conferencing
We present a wavelet-based video conferencing system built from two components:
o Openphone: peer-to-peer video conferencing system
o Dirac: wavelet-based video codec
Outline
o Introduction
o Video conferencing system: Openphone
o Wavelet-based video codec: Dirac
o Embedding process
o Improvements in Openphone
o Improvements in Dirac
o Scalability
o Summary
Video Conferencing System Overview
Openphone in Video Conferencing System
o What is Openphone?
o PWLIB
o SIP
o RTP
Dirac: Wavelet Based Video Codec
o What is Dirac?
o Open source technology
o High-performance video codec with a simple, modular design
o Supports multiple picture formats
o Good subjective quality due to the psychovisual properties of wavelets
Dirac Encoder Architecture
Dirac Integration in Openphone
Dirac version 0.9.1 has been successfully embedded in Openphone.
o Methods: source code integration, Dirac library
o Plug-in support in Openphone
o Registration of Dirac in Openphone
o Dirac bridging classes
Buffer I/O
Sender side:
o Takes video from the camera.
o Encodes it using the Dirac encoder and directs the output bit-stream to a buffer.
o Openphone then handles the output bit-stream buffer.
o Transmits the bit-stream over the network to the receiver side.
Receiver side:
o Openphone receives the bit-stream.
o Decodes it using the Dirac decoder.
o Displays the video on the local display window.
Parameter Passing from Openphone
We have modified the Dirac encoder/decoder to take its parameter list from Openphone instead of acquiring it through console input:
o Width
o Height
o Frame rate
o Bit-rate
o Video format
o Quality, etc.
The Dirac encoder/decoder uses this list to encode/decode as appropriate.
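The hand-off can be pictured as a small parameter record that the application fills in and passes to the codec. The sketch below is illustrative only: `EncoderParams`, `make_params`, the field types and the example values are assumptions, not the actual Openphone or Dirac interface. The two picture sizes are the standard QCIF and CIF dimensions.

```python
from dataclasses import dataclass

# Standard picture sizes (width, height) for the two formats.
PICTURE_SIZES = {"QCIF": (176, 144), "CIF": (352, 288)}


@dataclass(frozen=True)
class EncoderParams:
    """Hypothetical parameter list passed from the application to the codec."""
    width: int
    height: int
    frame_rate: int      # frames per second
    bit_rate_kbps: int   # target bit rate in kilobits per second
    video_format: str    # "QCIF" or "CIF"
    quality: float       # codec quality setting (scale is codec specific)


def make_params(video_format, frame_rate, bit_rate_kbps, quality):
    """Build a parameter list, deriving width and height from the format name."""
    if video_format not in PICTURE_SIZES:
        raise ValueError("unsupported video format: " + video_format)
    width, height = PICTURE_SIZES[video_format]
    return EncoderParams(width, height, frame_rate, bit_rate_kbps,
                         video_format, quality)


# Example values are placeholders, not settings taken from the project.
params = make_params("CIF", frame_rate=15, bit_rate_kbps=300, quality=7.0)
```

Deriving the width and height from the format name keeps the two sides from disagreeing about the picture size.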
Bit-stream Fragmentation
Divide the bit-stream into packets of a size the network supports.
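Fragmentation of this kind can be sketched in a few lines. The example assumes a maximum payload of 1400 bytes per packet, a common choice for staying under an Ethernet MTU but not a value taken from the project; `fragment` and `reassemble` are illustrative names, and real RTP packetization would also add headers and sequence numbers.

```python
from typing import List


def fragment(bitstream: bytes, max_payload: int = 1400) -> List[bytes]:
    """Split one encoded frame into chunks of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [bitstream[i:i + max_payload]
            for i in range(0, len(bitstream), max_payload)]


def reassemble(packets: List[bytes]) -> bytes:
    """Concatenate in-order packets back into the original frame bit-stream."""
    return b"".join(packets)


# A dummy 5120-byte "frame" stands in for real encoder output.
frame = bytes(range(256)) * 20
packets = fragment(frame)
restored = reassemble(packets)
```

The receiver can hand `restored` to the decoder only once every packet of the frame has arrived, which is why packet order and loss handling matter in a real transport.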
Size Upgrade (CIF)
o Added support for CIF size (352x288) in addition to QCIF size (176x144).
o Practical issues involved: increased encoding time and bit-stream size.
o Frame rate dropped to 2 to 3 fps.
Frame Rate Optimization
o Dirac usually generates a bit-stream of 5 to 75 kilobytes per frame.
o Modified the Openphone architecture to transmit the whole bit-stream of one frame before grabbing the next frame.
o Frame rate increased to 12 fps.
INTER Mode
o Motion estimation support
o Bit-stream size reduction
o Maintaining the starting and ending sequence of each frame
o Maintaining the initialization and re-initialization of the Dirac encoder
o Buffers introduced to hold the bit-stream before transmission
GUI Support
The user can configure the video conferencing options as desired:
o Dial / Listen
o Video size: CIF or QCIF
o Quality
o Encoding mode (INTER / INTRA)
GUI Main Window
Crashing Problem
o Video conferencing sessions originally lasted only a couple of hours before crashing.
o Memory leaks and bugs were removed from the code through extensive debugging.
o Tested for 50 hours.
Discrete Wavelet Transform
o Old DWT filter: DD9_7, 10 frames per second
o Experiments were run over all the available wavelet filters
o Replaced with HAAR0
o Performance gain: 25%
o Frame rate increased to 15 frames per second
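To illustrate why a Haar filter is cheap, here is a minimal sketch of one level of an integer Haar transform written as lifting steps. It is a generic textbook form, not Dirac's HAAR0 code; the function names are illustrative. Each pair of samples needs only one subtraction, one shift and one addition, whereas longer filters read several neighbouring samples for every output.

```python
def haar_forward(signal):
    """One level of an integer Haar transform (input length must be even)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    low, high = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        d = b - a           # predict step: detail (high-pass) coefficient
        s = a + (d >> 1)    # update step: floor of the pair average (low-pass)
        low.append(s)
        high.append(d)
    return low, high


def haar_inverse(low, high):
    """Undo haar_forward exactly by running the lifting steps in reverse."""
    out = []
    for s, d in zip(low, high):
        a = s - (d >> 1)
        b = a + d
        out.extend((a, b))
    return out


samples = [10, 12, 7, 3, 0, 255, 8, 8]
low, high = haar_forward(samples)
```

Because every lifting step is inverted exactly, the transform is lossless on integers; a 2D transform applies the same steps along rows and then columns.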
Dirac 1.0.2
o The latest release of the Dirac video codec, version 1.0.2, has been successfully integrated into Openphone.
o Frame rate is around 15 frames per second.
Client / Server Configurations
o Quality and size previously had to be set on both sides.
o Now only the dialer has the option, and whatever the dialer selects becomes the size used for the video conference.
Experiments and Results (Cont.)
Comparison of Dirac and other video codecs
[Figure: Bit rate vs. PSNR of Openphone codecs. X-axis: bitrate (50 to 500); Y-axis: average PSNR (dB, 20 to 50); series: H263-Intra, H261-Intra, Dirac-Intra]
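For reference, the PSNR used in this comparison is the standard measure 10 * log10(peak^2 / MSE). A minimal sketch for 8-bit samples follows; the `psnr` helper and the flat pixel lists are illustrative, not the measurement code used for the plot.

```python
import math


def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(decoded):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical pictures
    return 10.0 * math.log10(peak * peak / mse)


# Every pixel off by one gives MSE = 1.
example = psnr([100, 100, 100, 100], [101, 101, 101, 101])
```

An average PSNR for a sequence is then the mean of the per-frame values.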
Experiments and Results (Cont.)
Subjective quality difference between H.261, H.263 and Dirac at 300 kbps
[Figure panels: H.261, H.263, Dirac]
Improvements in Dirac
Improvement in motion estimation:
o 3D Recursive Search (3DRS) algorithm
o Reduced computational complexity
o Performance
Scalability:
o Three different bit-streams: base layer, enhancement layer 1, enhancement layer 2
o Three types: SNR, spatial, temporal
Improvements in Motion Estimation
The existing motion estimation algorithm has three stages:
o Motion vectors are found for each block to 1-pixel accuracy using hierarchical motion estimation
o Vectors are refined to sub-pixel accuracy
o Mode decision
Improvements in Motion Estimation (Cont.)
Process of down-conversion in hierarchical motion estimation
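Down-conversion builds a pyramid of progressively smaller pictures, so that the search can start on the smallest picture and be refined level by level. A minimal sketch, assuming each level halves the width and height by averaging 2x2 pixel blocks; the filter Dirac actually uses may differ, and `downconvert` and `build_pyramid` are illustrative names.

```python
def downconvert(frame):
    """Halve width and height by averaging each 2x2 block (rounded)."""
    height = len(frame) // 2 * 2
    width = len(frame[0]) // 2 * 2
    return [[(frame[y][x] + frame[y][x + 1] +
              frame[y + 1][x] + frame[y + 1][x + 1] + 2) // 4
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]


def build_pyramid(frame, levels):
    """Return [full-size picture, half size, quarter size, ...]."""
    pyramid = [frame]
    for _ in range(levels):
        pyramid.append(downconvert(pyramid[-1]))
    return pyramid


picture = [[0, 4, 8, 12],
           [4, 8, 12, 16],
           [8, 12, 16, 20],
           [12, 16, 20, 24]]
pyramid = build_pyramid(picture, levels=2)
```

A motion vector found at one level is doubled when it is carried over as a guide to the next larger level.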
Improvements in Motion Estimation (Cont.)
o At the lowest level, two candidate vector lists are generated: one centered at the zero motion vector and the other at the spatially predicted motion vector.
o At all other levels, three candidate vector lists are generated: the two above plus a guide vector list, derived from the best motion vector of the block at the immediately lower level.
o The sum of absolute differences (SAD) is used as the cost function.
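The SAD cost and the selection of the best vector from a candidate list can be sketched as follows. This is a simplified illustration, assuming integer-pixel vectors written as (dx, dy), square blocks and pictures stored as lists of rows; the function names are illustrative and not taken from the Dirac source.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a displaced block in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def in_bounds(ref, bx, by, dx, dy, size):
    """True if the displaced block lies completely inside the reference picture."""
    return (0 <= by + dy and by + dy + size <= len(ref) and
            0 <= bx + dx and bx + dx + size <= len(ref[0]))


def best_candidate(cur, ref, bx, by, candidates, size=4):
    """Return the candidate vector with the lowest SAD (the list should include (0, 0))."""
    valid = [v for v in candidates if in_bounds(ref, bx, by, v[0], v[1], size)]
    return min(valid, key=lambda v: sad(cur, ref, bx, by, v[0], v[1], size))


# Synthetic test pictures: cur is ref shifted by (dx, dy) = (2, 1).
ref = [[x * x + 3 * y * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
best = best_candidate(cur, ref, 2, 2, [(0, 0), (1, 0), (2, 1), (-1, -1), (5, 5)])
```

Every candidate costs one full block comparison, so the number of candidates per block drives the total motion estimation time.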
Improvements in Motion Estimation (Cont.)
Complexity of the existing algorithm:
o Uses hierarchical motion estimation
o Huge number of SAD calculations, taking 80% of total encoding time
3DRS algorithm:
o Spatial predictors
o Temporal predictors
[Figure: macroblocks A, C, X, B, E; legend: MB in current frame, MB in previous frame, current MB]
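The idea behind predictor-based search is to test only a handful of vectors borrowed from neighbouring blocks instead of scanning a search window. The sketch below builds such a candidate list in the spirit of 3DRS. The choice of neighbours (left, above and above-right in the current frame; co-located, right and below in the previous frame) is an assumption made for illustration and may not match the blocks shown in the figure, and the update vectors of the full 3DRS algorithm are omitted.

```python
def candidates_3drs(cur_field, prev_field, bx, by):
    """Collect candidate vectors for block (bx, by) from neighbouring blocks.

    cur_field:  vectors of the current frame, None where not yet estimated
    prev_field: complete vector field of the previous frame
    """
    rows, cols = len(prev_field), len(prev_field[0])
    cands = [(0, 0)]  # always test the zero vector
    # Spatial predictors: blocks already processed in raster-scan order.
    for ox, oy in ((-1, 0), (0, -1), (1, -1)):
        x, y = bx + ox, by + oy
        if 0 <= x < cols and 0 <= y < rows and cur_field[y][x] is not None:
            cands.append(cur_field[y][x])
    # Temporal predictors: blocks not yet processed in the current frame.
    for ox, oy in ((0, 0), (1, 0), (0, 1)):
        x, y = bx + ox, by + oy
        if 0 <= x < cols and 0 <= y < rows:
            cands.append(prev_field[y][x])
    # Drop duplicates while keeping the order.
    unique = []
    for v in cands:
        if v not in unique:
            unique.append(v)
    return unique


prev_field = [[(1, 0), (2, 0)],
              [(0, 1), (3, 3)]]
cur_field = [[(1, 1), None],
             [None, None]]
cands = candidates_3drs(cur_field, prev_field, bx=1, by=0)
```

With at most seven candidates per block here, the number of SAD evaluations stays far below that of an exhaustive search; a full search over a +/-8 pixel window, for comparison, tests 289 positions per block.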
Improvements in Motion Estimation (Cont.)
Proposed ME scheme based on 3DRS
Experiments and Results
Comparison of motion estimation (ME) results for CIF with GOP size = 36, B frames = 18. Reductions are for 3DRS relative to the original algorithm.
Sequence   Algorithm  File Size (KB)  PSNR-Y (dB)  Avg SAD/Frame  %SAD Reduction  %ME Time Reduction  %Total Time Reduction
Foreman    Original   1316            37.9834      122871         -               -                   -
Foreman    3DRS       1281            37.7774      47285          62%             61%                 49%
Container  Original   825             35.8540      142651         -               -                   -
Container  3DRS       769             35.5432      58371          59%             59%                 46%
News       Original   791             39.8091      76566          -               -                   -
News       3DRS       781             39.7951      27965          63%             60%                 43%
Highway    Original   7231            39.2548      107357         -               -                   -
Highway    3DRS       6732            39.0925      46962          57%             55%                 42%
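As a quick check, the SAD reduction figures can be recomputed from the average SAD per frame. The helper below is an illustrative function, not project code; the inputs are the Foreman and Container values reported for GOP size 36.

```python
def percent_reduction(original, improved):
    """Relative reduction of improved versus original, in percent."""
    return 100.0 * (original - improved) / original


# Average SAD per frame: original algorithm vs. 3DRS.
foreman = percent_reduction(122871, 47285)
container = percent_reduction(142651, 58371)
```

Rounded to whole percentages these give 62 and 59, matching the reported SAD reductions for the two sequences.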
Experiments and Results
Comparison of motion estimation (ME) results for CIF with GOP size = 54, B frames = 36. Reductions are for 3DRS relative to the original algorithm.
Sequence   Algorithm  File Size (KB)  PSNR-Y (dB)  Avg SAD/Frame  %SAD Reduction  %ME Time Reduction  %Total Time Reduction
Foreman    Original   1129            37.3678      126498         -               -                   -
Foreman    3DRS       1075            37.1229      48095          62%             62%                 49%
Container  Original   697             37.7331      79323          -               -                   -
Container  3DRS       635             37.7676      30872          61%             59%                 43%
News       Original   681             39.4961      78479          -               -                   -
News       3DRS       656             39.4300      28453          64%             60%                 44%
Highway    Original   6050            38.7711      110046         -               -                   -
Highway    3DRS       4692            38.5662      48299          56%             57%                 43%
Conclusion
o Implemented a modified adaptive 3DRS algorithm using temporal and spatial block motion vectors.
o Reduces the average number of SAD calculations per block by 50% to 65%.
o Reduces the motion estimation time by 50% to 60%.
o Reduces the total encoding time by 30% to 50%.
o Average PSNR and bit-stream size do not show significant variations.
Scalability
Three different bit-streams:
o Base layer (7-16)
o Enhancement layer 1 (4-6)
o Enhancement layer 2 (1-3)
Three types:
o SNR
o Spatial
o Temporal (in progress)
Dirac Bit Stream Syntax
Dirac Bit Stream Syntax (Cont.)
Dirac Architecture for Scalability
Dirac Bit-stream Splitter
Bit Stream Layers After Splitting
Bit-stream layers after splitting for an inter frame
Dirac Bit-stream Joiner
Bit Stream After Joining
Bit-stream after joining for an inter frame
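The splitter and joiner can be pictured with a small sketch. It assumes, and this is only an assumption, that the ranges quoted for the layers (7-16, 4-6, 1-3) are wavelet subband indices, and it models a frame as a mapping from subband index to coded bytes. The names `LAYERS`, `split` and `join` are illustrative; the real Dirac bit-stream carries headers and motion data that are ignored here.

```python
# Assumed mapping of layers to subband indices (inclusive ranges).
LAYERS = {
    "base": range(7, 17),   # subbands 7-16
    "enh1": range(4, 7),    # subbands 4-6
    "enh2": range(1, 4),    # subbands 1-3
}


def split(subbands):
    """Split one frame's subbands into base and enhancement layers."""
    return {name: {i: subbands[i] for i in indices if i in subbands}
            for name, indices in LAYERS.items()}


def join(*layers):
    """Merge whichever layers were received back into one frame."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return dict(sorted(merged.items()))


# Dummy one-byte payloads stand in for coded subband data.
frame = {i: bytes([i]) for i in range(1, 17)}
layers = split(frame)
full = join(layers["base"], layers["enh1"], layers["enh2"])
base_only = join(layers["base"])
```

A receiver that gets only the base layer decodes `base_only`, while a receiver that gets all three layers recovers the complete frame.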
Scalability in Openphone
Three different bit-streams:
o Base layer (7-16)
o Enhancement layer 1 (4-6)
o Enhancement layer 2 (1-3)
Three types:
o Spatial
o SNR
o Temporal (not completed)
Scalability in Openphone (Cont.)
Each user receives video according to their capabilities.
Tasks Completed
o Understanding the code of the different modules of the Dirac video codec
o Documenting the algorithmic details
o Improving the encoding time of Dirac: 3DRS implementation, computational complexity reduction, wavelet transform improvement
o Implementing SNR and spatial scalability in Dirac
o Integrating Dirac in Openphone
o Implementing multi-streaming in Openphone
o Providing graphical user interface support
o Supporting picture sizes up to CIF format
o Bringing Dirac to real time: frame rate extended from 3 to 15 fps
Ongoing and Future Work
o Addressing the performance of other modules
o Implementing the wavelet coefficient parent-child relationship to improve scalable video performance
o Implementing temporal scalability
o Extending support to HDTV
o Multi-user support for scalability in Openphone
o Association of the audio stream
o Improving the GUI in line with new developments
o Chat, file sharing