FCE: A Fast Content Expression for Server-based Computing

Qiao Li
Mentor Graphics Corporation
1001 Ridder Park Drive, San Jose, CA 95131, U.S.A.
Email: qiao_li@mentor.com

Fei Li
Department of Computer Science, Columbia University
New York, NY 10027, U.S.A.
Email: lifei@cs.columbia.edu

Abstract

Server-based computing (SBC) is an approach that delivers computational services across the network, with the advantages of reduced administrative costs and better resource utilization. In SBC, all application processing is done on servers and only screen updates are sent to clients. We introduce a fast content expression (FCE) for coding screen updates. Given a square region of pixel values, FCE constructs a table of the unique pixel values in the region and converts each value in the original region into an index into the table. We have implemented our algorithm and compared it with other popular coding methods, including JPEG-LS, JPEG, GIF, gzip, VNC hextile, and various combinations. Our results show that our approach provides low coding complexity with reasonable compression.

I. INTRODUCTION

In recent years, there has been a growing trend away from the distributed model of desktop computing toward a more centralized model of server-based computing (SBC). In SBC, all application processing is carried out by a set of shared server machines. Clients connect to the servers for all their computing needs. Since SBC servers maintain the full persistent state of user sessions, the only functionality needed on the client is the ability to send keyboard and mouse input to the server and to receive graphical display updates from the server. SBC offers the potential of reducing the total cost of computational services through reduced system management cost and better utilization of shared hardware resources.
The key enabling technology underlying the SBC approach is the remote display protocol, which enables graphical displays to be served across a network to a client device while applications and even window systems are executed on the server side. A number of SBC encoding techniques have been developed, ranging from higher-level graphics primitives to lower-level pixel-based compression techniques. However, existing SBC encoding techniques have been shown to be ineffective in supporting the display demands of multimedia applications [4].

In this paper, we present fast content expression (FCE) to encode graphical display content. FCE benefits from temporal and spatial similarity: neighboring pixels are correlated and therefore contain redundant information. Given a square region of pixel values, FCE constructs a table of the unique pixel values in the region and represents each value in the original region by an index into the table. As any image can be decomposed into square regions, the FCEs of those square regions can represent an arbitrary screen update. We have implemented FCE and compared it with other popular image coding methods, including JPEG-LS, JPEG, PNG, GIF, and gzip. Our results show that our approach provides low coding complexity with reasonable compression.

This paper is organized as follows. Section II discusses related work in image compression. Section III introduces the FCE expression for representing image content in a square region. Section IV describes how we apply FCE in developing a new image compression algorithm. Section V presents experimental results comparing the FCE compression algorithm with other compression methods. Finally, we conclude with Section VI.

II. RELATED WORK

Previous approaches to encoding screen updates can be classified into two categories: graphics-based and pixel-based.
Graphics-based approaches employ a variety of higher-level graphics primitives to represent screen updates in terms of fonts, lines, bitmaps, etc. These approaches are used in systems such as X, Windows Terminal Services [2], Citrix MetaFrame [1], and Tarantella [8]. Despite the range of available encoding primitives, the screen updates associated with multimedia applications such as images and video are typically encoded as raw pixel bitmaps, so little compression is achieved on these screen updates.

Pixel-based approaches are simpler and treat a screen update just as a region of pixels. They are used in systems such as Sun Ray [9] and Virtual Network Computing (VNC) [5]. As these approaches employ similar bitmap encoding primitives for multimedia display workloads, the compression they achieve for screen updates is limited.

More recently, some SBC pixel-based encoding techniques have been developed that take advantage of the common image characteristics of screen updates. These methods achieve improved compression ratios, but at the cost of higher encoding and decoding complexity. These coding costs limit the utility of these techniques for supporting multimedia applications in SBC environments.

Depending on image properties, different compression algorithms have different advantages. Existing algorithms with better compression ratios have higher complexity and thus limited applicability. Hence, an encoding method that is simple and yet achieves a higher compression ratio is needed.

III. FCE: FAST CONTENT EXPRESSION

FCE is a general extension of the hextile compression method in VNC [5]. It takes advantage of temporal and spatial correlations in screen updates by introducing a local table of content expressions.
FCE is a variable-length expression that describes the screen update content of a square region. Suppose the update region has $s$ pixels along each edge and that, among the $s^2$ pixel values, there are $A$ different values ($A \le s^2$). If we use a queue to hold all $A$ different values, then $\lceil \log_2 A \rceil$ bits are enough to express the offset $i$ into the queue for any specific value in the update region. In other words, within the square region each pixel value and its offset $i$ have a one-to-one mapping. FCE can therefore use the offsets of the values in the queue instead of full pixel values. The format of FCE is shown in Fig. 1.

[Fig. 1: FCE format to represent image content in a square region: Field 1, Field 2, Field 3.]

Each FCE consists of 3 fields.

1. Field 1 (number of different pixel values): This field contains the value $A$, which indicates how many different pixel values are in this square region. It helps in traversing field 2 to extract all pixel values and implies the boundary between field 2 and field 3 in the FCE. The value of $A$ depends on the spatial redundancy of the square region in the image.

2. Field 2 (queue of different pixel values): This field is the queue holding all $A$ different pixel values. The length of this field depends on the value in the first field ($A$) and on the number of bits used to store one full pixel value on the machine.

3. Field 3 (offsets for all pixels in scan-line order): For each pixel, this field uses the offset of its value in the queue (field 2) to denote its pixel value. As there are $A$ different values, each pixel uses up to $\lceil \log_2 A \rceil$ bits for the offset. Field 3 contains the offsets for all the pixels in the square region and has a length of $s^2 \lceil \log_2 A \rceil$ bits. FCE records these offsets in scan-line order.

An example of writing an FCE expression for a $4 \times 4$ image is shown in Fig. 2.
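The three fields described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `fce_encode`, `offset_bits`, and `fce_decode` are hypothetical, the queue is filled in first-seen scan-line order (an assumption), and the per-offset width is taken as the ceiling of $\log_2 A$.

```python
def fce_encode(square):
    """Encode an s x s square (a list of rows) as the three FCE fields."""
    table = []        # field 2: queue of distinct pixel values, first-seen order (assumed)
    index_of = {}     # helper map: pixel value -> offset into the queue
    offsets = []      # field 3: one offset per pixel, in scan-line order
    for row in square:
        for value in row:
            if value not in index_of:
                index_of[value] = len(table)
                table.append(value)
            offsets.append(index_of[value])
    return len(table), table, offsets   # field 1 is A = len(table)


def offset_bits(count):
    """Bits needed per offset: ceil(log2(A)), which is 0 when A == 1."""
    return (count - 1).bit_length()


def fce_decode(table, offsets, s):
    """Rebuild the s x s square from the queue and the offsets."""
    return [[table[offsets[r * s + c]] for c in range(s)] for r in range(s)]


square = [[7, 7], [9, 7]]
count, table, offsets = fce_encode(square)
print(count, table, offsets, offset_bits(count))  # 2 [7, 9] [0, 0, 1, 0] 1
```

Decoding needs only the queue and the offsets; the count $A$ tells a decoder where field 2 ends and how wide each offset in field 3 is.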
In Fig. 2, the sample image is divided into 4 squares whose edges are 2 pixels long. There are 4 FCE expressions corresponding to these 4 squares. Each FCE uses its own pixel value queue to represent the pixel values in its $2 \times 2$ square.

[Fig. 2: Sample of using FCE for image compression. The FCE expressions of the squares are written to a compressed file, which is sent to gzip for further compression.]

If we use $m$ bits to store the count $A$ ($\lceil \log_2 A \rceil \le m$) and $n$ bits to represent each different full pixel value, the length $L$ of the FCE, in bits, for a square region with edge size $s$ is

$$L = m + A\,n + s^2 \lceil \log_2 A \rceil \qquad (1)$$

For ease of extracting the first field of the FCE, we make $m$ a constant that is not less than $\log_2(s^2)$, since $A \le s^2$. As the FCE has a header composed of $A$ and a pixel value queue, it may be longer than a representation that uses a full pixel value for each pixel. Whether the FCE representation is shorter depends on the square edge size $s$ and on the number of different pixel values $A$, which reflects the redundancy in the image. Hence, to make sure that $L$ is no longer than the full pixel value representation, we need

$$2 \log_2(s) + A\,n + s^2 \lceil \log_2 A \rceil \le s^2 n \qquad (2)$$

If we let $n = 8$ in this inequality, as used on most machines, Fig. 3 shows the resulting relation between $A$ and $s$. Only values under the curve should be chosen for $A$ in order to satisfy Equ. 2.

[Fig. 3: Relation between square edge size $s$ (horizontal axis) and number of different pixel values $A$ (vertical axis).]

At the same time, if we know the pixel value distribution of the image, we can select the square edge size $s$. If we denote the compression ratio by $\beta$, we have

$$\beta(A, s) = \frac{s^2 n}{2 \log_2(s) + A\,n + s^2 \lceil \log_2 A \rceil} \qquad (3)$$

From Equ. 3, the compression ratio $\beta$ depends on the square size $s$ and on $A$. The optimal size $s$ for the squares is the one at which the highest compression ratio is attained. Suppose the density of an image (the ratio of the number of different pixel values to the total number of pixels) is 0.1, which is typical for a smooth-toned image; the relation between ratio gain and $s$ is then shown in Fig. 4. We can only choose values under the curve for the compression ratio $\beta$ and $s$. In order to satisfy Equ. 2 and gain as much compression ratio $\beta$ as possible, we need to find the crossing point of the two curves in Fig. 3 and Fig. 4. We can see that 16 is one of the best choices for obtaining an image in FCE format with the shortest length. Equ. 2 and Equ. 3 thus set up the theoretical basis for choosing the square edge size.
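To make the length and ratio formulas of this section concrete, the following sketch evaluates Equ. 1 and Equ. 3 numerically. It is an illustration under stated assumptions: the helper names are hypothetical, the per-offset width is rounded up to whole bits, and $m$ is taken as the smallest whole number of bits not less than $\log_2(s^2)$.

```python
def fce_length_bits(s, count, n=8):
    """Length L of one FCE in bits (Equ. 1) for edge size s and A = count."""
    m = (s * s - 1).bit_length()               # assumed header width, >= log2(s^2)
    per_offset = (count - 1).bit_length()      # ceil(log2(A)), 0 when A == 1
    return m + count * n + s * s * per_offset


def fce_ratio(s, count, n=8):
    """Compression ratio beta (Equ. 3): raw bits divided by FCE bits."""
    return (s * s * n) / fce_length_bits(s, count, n)


# A 16 x 16 square with 26 distinct 8-bit values (a density of about 0.1):
# L = 8 + 26 * 8 + 256 * 5 = 1496 bits against 2048 raw bits.
print(fce_length_bits(16, 26))   # 1496
print(fce_ratio(16, 256) < 1)    # True: with A = s^2 the FCE is longer than raw pixels
```

The second call illustrates why Equ. 2 matters: when every pixel in the square is different, the header and the full-width offsets make the FCE longer than the raw pixel values.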
[Fig. 4: Relation between square edge size $s$ (horizontal axis) and compression ratio (vertical axis).]

IV. COMPRESSION ALGORITHM

From the discussion of the FCE format above, we know that if a region can be represented with few different pixel values, we can use the FCE expression to compress it. The basis for this approach is the assumption that neighboring pixel values are very similar.

A. Compression Algorithm

Following the discussion in Section III, we choose 16 for $s$. Assume the given image has 3 planes, called R, G, and B. The outline of the compression algorithm is as follows.

1. For each RGB plane, do the following:
1.1 Decompose the image on the plane into squares of size $16 \times 16$;
1.2 Allocate a matrix $M_{r,g,b}$ to store the offsets for all pixel values;
1.3 For each square $i$, allocate a queue $Q_i$ to store the different pixel values in this square.
2. For each square $i$ in each plane, do the following:
2.1 Put all different pixel values in square $i$ into queue $Q_i$;
2.2 Record the offsets for all pixel values in square $i$ into $M_{r,g,b}$.
3. Record $M_{r,g,b}$ and all $Q$s for each plane.
4. Use gzip to further compress $M$ and $Q$.

Pseudocode describing this procedure is given below. Let the number of pixels be $S$. We allocate matrices R, G, and B of size $S$ to contain the pixel R, G, and B values, and offset matrices $M_r$, $M_g$, and $M_b$ of size $S$ to hold the pixel value offsets in the corresponding square queues. For each square, we allocate a pixel value queue $Q$. For each queue $Q$, a corresponding entry of the array $A$, which contains the number of different pixel values, is initialized to 0.
    for each of the R, G, and B planes do
        for each square (i, j) in the plane do
            for each pixel value v in square (i, j) do
                if v is not in Q_{i,j} then
                    insert v into Q_{i,j};
                    A_{i,j}++;
                end if
                record the offset of v in Q_{i,j} into M;
            end for
        end for
        for each square (i, j) in the plane do
            write A_{i,j} into the compressed file;
        end for
        for each square (i, j) in the plane do
            write Q_{i,j} into the compressed file;
        end for
        write M into the compressed file;
    end for
    run gzip on the compressed file of each plane.

An example showing how the algorithm works on the previous sample is given in Fig. 2, which shows the case of one plane. We first put the $A$ value of each square into the compressed file, then all the different pixel values, followed by the offset of each pixel value.

There are three points to note in our algorithm, which make the file containing the FCE expressions more favorable to gzip compression.

1. We write the FCE expressions plane by plane. Within each plane, instead of writing the FCE of each square one after another, we put the pixel values of all the squares together, followed by all the offsets. This layout attains better compression under gzip. It relies on the property that neighboring squares may have a similar number of different pixel values and similar offsets.

2. We output offsets as bits instead of bytes in order to save space. When neighboring pixels are similar to each other, FCE encoding gains a lot of compression because each pixel needs just $\lceil \log_2 A \rceil$ bits. If $A$ is 1, the whole square region needs just one pixel value in the FCE expression.

3. For ease of decoding pixel values, we fix the size of the $A$ field to fit $\max(A_i)$ over all the squares. In our experiments, we employ $\log_2(s^2)$ bits for $A$, which is one byte for $s = 16$.

From the discussion above, we know that the advantage of using FCE rests on the assumption that the density of different pixel values in the image is not large.
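The per-plane procedure and file layout described above can be sketched as follows. This is a minimal sketch under several assumptions, not the authors' code: the function names are hypothetical, the plane is a flat list of 8-bit values whose width and height are multiples of `s` (with `s` at most 16), the count field stores $A - 1$ in one byte so that $A = 256$ fits, and Python's `zlib` module (the DEFLATE coder that gzip also uses) stands in for the gzip utility.

```python
import zlib


def bits_for(count):
    """Offset width in bits: ceil(log2(A)), 0 when A == 1."""
    return (count - 1).bit_length()


def encode_plane(plane, width, height, s=16):
    """Write all counts, then all queues, then all bit-packed offsets, then deflate."""
    counts, tables, bit_chunks = [], [], []
    for top in range(0, height, s):
        for left in range(0, width, s):
            table, index_of, offsets = [], {}, []
            for y in range(top, top + s):
                for x in range(left, left + s):
                    v = plane[y * width + x]
                    if v not in index_of:
                        index_of[v] = len(table)
                        table.append(v)
                    offsets.append(index_of[v])
            counts.append(len(table) - 1)      # assumed: store A - 1 in one byte
            tables.extend(table)
            nbits = bits_for(len(table))
            if nbits:
                bit_chunks.extend(format(o, f"0{nbits}b") for o in offsets)
    bits = "".join(bit_chunks)
    bits += "0" * (-len(bits) % 8)             # pad to a whole number of bytes
    packed = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return zlib.compress(bytes(counts) + bytes(tables) + packed)


def decode_plane(blob, width, height, s=16):
    """Inverse of encode_plane: rebuild the flat plane."""
    raw = zlib.decompress(blob)
    nsq = (width // s) * (height // s)
    counts = [c + 1 for c in raw[:nsq]]
    pos, tables = nsq, []
    for a in counts:
        tables.append(raw[pos:pos + a])
        pos += a
    bits = "".join(format(b, "08b") for b in raw[pos:])
    plane, bitpos, k = [0] * (width * height), 0, 0
    for top in range(0, height, s):
        for left in range(0, width, s):
            table, nbits = tables[k], bits_for(counts[k])
            k += 1
            for y in range(top, top + s):
                for x in range(left, left + s):
                    idx = int(bits[bitpos:bitpos + nbits], 2) if nbits else 0
                    bitpos += nbits
                    plane[y * width + x] = table[idx]
    return plane


# Small synthetic 32 x 32 plane made of flat 8 x 8 blocks.
demo = [((x // 8 + y // 8) % 4) * 60 for y in range(32) for x in range(32)]
assert decode_plane(encode_plane(demo, 32, 32), 32, 32) == demo
```

Grouping the counts, the queues, and the offsets into three contiguous runs mirrors the first point in the list above; whether it helps on a given image depends on how similar neighboring squares are.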
Given an image whose content is expressed in RGB values, the question is whether it is possible to convert it to a representation with even lower density. If the conversion is good enough, we can achieve a near-lossless solution with an even better compression ratio.

B. Color Space Plane Conversion

The formula is based on Julien's work [6]. The YUV planes give more similarity among the values in each square. That is,

$$Y = 0.299\,R + 0.587\,G + 0.114\,B \qquad (4)$$
$$U = (B - Y) \times 0.565 \qquad (5)$$
$$V = (R - Y) \times 0.713 \qquad (6)$$

with its reciprocal versions:

$$R = Y + 1.403\,V \qquad (7)$$
$$G = Y - 0.344\,U - 0.714\,V \qquad (8)$$
$$B = Y + 1.770\,U \qquad (9)$$

Regarding the color space plane conversion, the experimental data in Table I show that conversion from RGB planes to YUV planes can make neighboring pixel values more similar to each other. The table shows, for each plane, the number of different pixel values $A$ and the variance var per $16 \times 16$ square.

TABLE I
COLOR PLANE CONVERSION STATISTICS

                 airplane  baboon  fruits  lena   peppers
A   (R plane)    49        84      41      48     53
var (R plane)    34.23     29.42   25.19   29.94  25.71
A   (G plane)    5         93      46      56     57
var (G plane)    36.95     2.29    25.38   29.36  31.93
A   (B plane)    38        94      55      51     54
var (B plane)    27.7      29.1    24.11   24.61  28.88
A   (Y plane)    49        86      45      52     54
var (Y plane)    34.99     29.39   24.49   29.86  28.88
A   (U plane)    18        43      21      22     28
var (U plane)    13.61     12.5    1.55    9.66   11.34
A   (V plane)    13        4       15      22     32
var (V plane)    9.36      13.73   9.81    9.44   28.88

From the table, we can see that the Y plane has an $A$ value and variance similar to those of the R plane. However, the U and V planes have much smaller $A$ values and variances than the G and B planes of the same images. This assures us that the conversion is good for FCE expressions in lossy compression. The conversion from RGB to YUV and back does incur a loss, but the error is bounded to be less than 5 for all 256 possible values each color can assume. As this error hardly has any visual effect, the conversion is near-lossless, and the converted data is more amenable to compression.

V. EXPERIMENTAL RESULTS

To evaluate the performance of FCE, we compared its compression performance and coding complexity against several other popular compression methods for toned images. These methods include JPEG-LS, lossless JPEG using Huffman coding, PNG, gzip, lossy JPEG from IJG with quality equal to 100, and GIF, which is lossy due to the conversion from 24-bit to 8-bit color. For desktop-like screen updates, FCE gains much over compression algorithms that favor toned images; here we consider multimedia application coding only. We show results for using FCE in both the RGB and YUV color spaces.
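As a side check on the color plane conversion of Section IV-B (Equ. 4 to 9), the following sketch converts RGB triples to YUV and back and measures the largest round-trip error on a coarse grid of colors. It is illustrative only: the function names are hypothetical, and rounding Y, U, and V to the nearest integer and clamping the reconstructed values to the range 0 to 255 are assumptions not stated in the text.

```python
def rgb_to_yuv(r, g, b):
    """Forward conversion (Equ. 4-6), with results rounded to integers (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = (b - y) * 0.565
    v = (r - y) * 0.713
    return round(y), round(u), round(v)


def yuv_to_rgb(y, u, v):
    """Inverse conversion (Equ. 7-9), rounded and clamped to 0..255 (assumed)."""
    def clamp(x):
        return max(0, min(255, round(x)))
    return (clamp(y + 1.403 * v),
            clamp(y - 0.344 * u - 0.714 * v),
            clamp(y + 1.770 * u))


def max_roundtrip_error(step=17):
    """Largest per-channel error over a grid of RGB triples spaced `step` apart."""
    worst = 0
    levels = range(0, 256, step)
    for r in levels:
        for g in levels:
            for b in levels:
                back = yuv_to_rgb(*rgb_to_yuv(r, g, b))
                worst = max(worst, max(abs(p - q) for p, q in zip((r, g, b), back)))
    return worst


print(rgb_to_yuv(128, 128, 128))   # (128, 0, 0): a gray pixel has no chroma
print(max_roundtrip_error() < 5)   # True on this sample grid
```

The check samples only a subset of the color cube, so it supports rather than proves the bound of 5 quoted in Section IV-B.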
We used 5 different images for our measurements, taken from a standard collection of test images [7]. The measurements were performed on an IBM NetVista PC with a 1 GHz AMD Athlon CPU and 256 MB RAM, running RedHat Linux 7.1.

Table II shows the compression results in terms of total image size after compression for each compression algorithm on each of the 5 images. Table III shows the corresponding compression ratio for each algorithm. Tables IV and V show the coding complexity results in terms of encoding and decoding time, respectively, for each compression algorithm on each of the 5 images. Tables II through V show that FCE consistently achieves much faster encoding and modest decoding time with a similar compression ratio. Compared with OLI [3], FCE gains a better compression ratio and faster coding for multimedia applications.

We also measured the performance of FCE-RGB and FCE-YUV with other tile sizes. Our measurements confirmed our earlier analysis, which showed that a $16 \times 16$ square region for FCE expressions provides the best FCE compression performance.

VI. CONCLUSIONS AND FUTURE WORK

In this paper, we first introduced a fast content expression to describe screen update content. We then applied it to improve lossless and near-lossless compression for multimedia applications. The new lossless image compression algorithm integrates well with the popular gzip compression utility. We have implemented our algorithm and compared it with other popular coding methods, including JPEG-LS, JPEG, GIF, gzip, and various combinations. Our results show that our approach provides low coding complexity with modest compression, which makes it suitable for server-based computing.

REFERENCES

[1] Citrix Systems, "Citrix MetaFrame 1.8 Backgrounder," Citrix White Paper, June 1998.
[2] B. C. Cumberland and G. Carius, "Microsoft Windows NT Server 4.0, Terminal Server Edition: Technical Reference," Microsoft Press, Redmond, WA, August 1999.
[3] F. Li and J. Nieh, "Optimal Linear Interpolation for Server-based Computing," in Proceedings of IEEE International Conference on Communications, New York, NY, USA, April 2002, pp. 2542-2546.
[4] J. Nieh and S. J. Yang, "Measuring the Multimedia Performance of Server-Based Computing," in Proceedings of the Tenth International Workshop on Network and Operating System Support for Digital Audio and Video, Chapel Hill, NC, June 2000.
[5] T. Richardson, Q. Stafford-Fraser, K. R. Wood, and A. Hopper, "Virtual Network Computing," IEEE Internet Computing, Vol. 2, No. 1, January/February 1998.
[6] YUV conversion, http://www.webartz.com/fourcc/fccyvrgb.htm
[7] Standard test images, http://www.geocities.com/siliconvalley/lakes/6686/test-images/
[8] The Santa Cruz Operation, "Tarantella Web-Enabling Software: The Adaptive Internet Protocol," A SCO Technical White Paper, December 1998.
[9] Sun Ray 1 Enterprise Appliance, Sun Microsystems, http://www.sun.com/products/sunray1
TABLE II
IMAGE SIZE IN BYTES AFTER COMPRESSION

                            airplane  baboon   fruits   lena     peppers
no compression              786,432   786,432  786,432  786,432  786,432
JPEG-LS                     387,964   66,71    46,58    446,342  467,448
JPEG Huffman                45,33     54,577   59,636   56,269   55,798
PNG                         475,634   648,65   491,67   525,763  556,653
gzip                        577,975   786,492  572,525  733,192  731,298
JPEG (IJG quality = 100)    213,83    342,881  224,464  229,952  248,633
GIF                         298,66    298,66   298,66   298,66   298,66
FCE-RGB                     475,122   727,87   487,718  521,897  535,592
FCE-YUV                     391,579   582,61   41,44    452,4    496,946

TABLE III
COMPRESSION RATIOS

                            airplane  baboon  fruits  lena  peppers  AVERAGE
JPEG-LS                     2.3       1.29    1.93    1.76  1.68     1.89
JPEG Huffman                1.74      1.45    1.54    1.4   1.42     1.52
PNG                         1.65      1.21    1.59    1.49  1.41     1.63
gzip                        1.35      1.      1.37    1.7   1.7      1.38
JPEG (IJG quality = 100)    3.67      2.29    3.5     3.41  3.15     3.65
GIF                         2.63      2.63    2.63    2.63  2.63     3.17
FCE-RGB                     1.48      1.55    1.5     1.47  1.47     1.49
FCE-YUV                     1.79      1.67    1.83    1.69  1.58     1.7

TABLE IV
ENCODING TIME COMPLEXITY IN MILLISECONDS

                            airplane  baboon  fruits  lena  peppers
JPEG-LS                     12        21      26      19    21
JPEG Huffman                32        29      32      32    36
PNG                         16        15      19      22    21
gzip                        2         25      32      24    28
JPEG (IJG quality = 100)    27        31      28      35    3
GIF                         21        154     189     19    19
FCE-RGB                     45        46      46      45    45
FCE-YUV                     98        1       1       1     99

TABLE V
DECODING TIME COMPLEXITY IN MILLISECONDS

                            airplane  baboon  fruits  lena  peppers
JPEG-LS                     11        14      18      16    15
JPEG Huffman                39        37      33      36    41
PNG                         11        13      16      155   167
gunzip                      3         4       36      5     42
JPEG (IJG quality = 100)    18        19      21      2     19
GIF                         14        115     13      169   15
FCE-RGB                     17        324     19      156   11
FCE-YUV                     157       365     158     158   323