IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 58, NO. 7, JULY 2011

A Reconfigurable Multi-Transform VLSI Architecture Supporting Video Codec Design

Kanwen Wang, Jialin Chen, Wei Cao, Ying Wang, Lingli Wang, Member, IEEE, and Jiarong Tong

Abstract: This brief presents a reconfigurable VLSI architecture designed for a multi-transform codec covering the MPEG-2/4, VC-1, H.264/AVC, and AVS video coding standards. A reconfigurable multiple constant multiplication algorithm with two fusing strategies generates the constant multipliers in the matrix calculation blocks. In addition, an adder-sharing strategy is adopted in the unified preprocessing/postprocessing block to save circuit area. The proposed architecture supports different standards through static reconfiguration and forward/inverse transform functions through dynamic reconfiguration. It is suitable for real-time processing in a 1080P HD video codec with the transforms of six video standards.

Index Terms: Adder-sharing, computer-aided design (CAD), dynamic reconfiguration, multi-transform, reconfigurable multiple constant multiplication (RMCM), static reconfiguration.

I. INTRODUCTION

With the advent of video coding standards such as Moving Picture Experts Group (MPEG-2/4), Windows Media Video 9 (VC-1), Advanced Video Coding (H.264/AVC), and the Audio Video Coding Standard of China (AVS), there is an urgent need to integrate them into a single chip. Simply combining multi-standard codec circuits increases silicon area and power consumption to an unacceptable degree. A closer look at the encoding and decoding processes, however, reveals a good deal of logic that can be shared. Moreover, the basic algorithms of the different compression standards are alike despite their individual coding methods. Therefore, through reconfiguration, different standards can be incorporated efficiently.

In video compression, the discrete cosine transform (DCT) is widely used because it concentrates signal information in a few low-frequency components. To meet the requirement of real-time processing, hardware implementations of the 2-D DCT/inverse DCT (IDCT) are adopted. The 2-D DCT/IDCT can be implemented with a 1-D DCT/IDCT and a transpose memory in a row-column decomposition manner. However, the DCT requires floating-point multiplication, which causes precision problems in hardware implementations. Hence, a transform without floating-point multiplication, namely the integer transform, has been introduced. It is similar, but not identical, to the DCT. The integer transform is employed in all of these video coding standards except MPEG-2/4. The similarity of the transform matrices among different coding standards can be exploited for sharing.

Manuscript received March 15, 2011; accepted April 5, 2011. Date of current version July 20, 2011. This work was supported in part by the State Key Laboratory of ASIC and System Research Program of Fudan University under Grant 09ZD005 and Grant 09XT00, by the National 863 Program of China under Grant 009AA0101, and by the Fundamental Research Funds for the Central Universities. This paper was recommended by Associate Editor B.-D. (Brian) Liu. The authors are with the State Key Laboratory of ASIC and System, Fudan University, Shanghai 201203, China. Corresponding author: Wei Cao (caow@fudan.edu.cn). Digital Object Identifier /TCSII

In the literature, many multi-transform designs have been published. A low-cost VLSI architecture is designed for multistandard inverse transform in [5]. In [6], the delta matrix is employed for sharing the inverse multi-transform. In [7], fast 1-D integer forward/inverse transforms for VC-1 are proposed. However, these designs either target video decoders only or support a limited set of standards. On the other hand, finding an optimal solution for such designs involves a large search space, so it is usually accomplished with computer-aided design (CAD).
In [2] and [3], such tools have been developed for multiple constant multiplication (MCM). However, the solutions can only be applied to multiplication with multiple outputs or with a reconfigurable single output. In this brief, a multi-transform VLSI architecture utilizing the reconfigurable MCM (RMCM) algorithm is proposed for the real-time processing of 1080P HD video, which supports both forward and inverse transforms of MPEG-2/4, VC-1, H.264/AVC, and AVS.

The rest of this brief is organized as follows. Section II reviews the 1-D multi-transform. Section III presents the proposed circuit architecture along with the RMCM algorithm and the adder-sharing strategy. The VLSI implementation results and comparisons are given in Section IV. Finally, Section V concludes this brief.

II. REVIEWS OF 1-D MULTI-TRANSFORM

In video coding standards, 8 × 8 transform coding is required in MPEG-2/4, VC-1, H.264/AVC, and AVS, whereas 4 × 4 transform coding is required in VC-1 and H.264/AVC. The 1-D 8-point forward transform coefficient matrix is

\[
C_{tran\_8} =
\begin{bmatrix}
a & a & a & a & a & a & a & a \\
b & c & d & e & -e & -d & -c & -b \\
f & g & -g & -f & -f & -g & g & f \\
c & -e & -b & -d & d & b & e & -c \\
a & -a & -a & a & a & -a & -a & a \\
d & -b & e & c & -c & -e & b & -d \\
g & -f & f & -g & -g & f & -f & g \\
e & -d & c & -b & b & -c & d & -e
\end{bmatrix}.
\tag{1}
\]

In total, there are 64 coefficients in this matrix, with seven different values, a to g. The inverse transform matrix is the transpose of the forward transform matrix. By using the fast algorithm from [1], the 8-point forward transform matrix can be decomposed into two 4-point forward transform matrices, which are

\[
U_{tran\_4} =
\begin{bmatrix}
a & a & a & a \\
f & g & -g & -f \\
a & -a & -a & a \\
g & -f & f & -g
\end{bmatrix},
\qquad
V_{tran\_4} =
\begin{bmatrix}
b & c & d & e \\
c & -e & -b & -d \\
d & -b & e & c \\
e & -d & c & -b
\end{bmatrix}.
\tag{2}
\]
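The decomposition can be checked numerically. The following is a minimal sketch, not the authors' implementation: the coefficient values are arbitrary placeholders (the identity holds for any seven numbers), the minus signs follow the standard DCT-like sign pattern, which is an assumption wherever this transcription lost them, and all function names are hypothetical. Because the inverse matrix is the transpose of the forward one, the same two 4-point blocks also serve the inverse transform, with the butterfly moved from the input to the output.

```python
# Placeholder values for a..g; any seven numbers satisfy the identity.
a, b, c, d, e, f, g = 1, 2, 3, 4, 5, 6, 7

C8 = [
    [a,  a,  a,  a,  a,  a,  a,  a],
    [b,  c,  d,  e, -e, -d, -c, -b],
    [f,  g, -g, -f, -f, -g,  g,  f],
    [c, -e, -b, -d,  d,  b,  e, -c],
    [a, -a, -a,  a,  a, -a, -a,  a],
    [d, -b,  e,  c, -c, -e,  b, -d],
    [g, -f,  f, -g, -g,  f, -f,  g],
    [e, -d,  c, -b,  b, -c,  d, -e],
]
U4 = [[a, a, a, a], [f, g, -g, -f], [a, -a, -a, a], [g, -f, f, -g]]
V4 = [[b, c, d, e], [c, -e, -b, -d], [d, -b, e, c], [e, -d, c, -b]]


def matvec(m, v):
    """Plain matrix-vector product on lists."""
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_direct(x):
    return matvec(C8, x)


def forward_decomposed(x):
    # Input butterfly: sums feed U (even outputs), differences feed V (odd outputs).
    sums = [x[i] + x[7 - i] for i in range(4)]
    diffs = [x[i] - x[7 - i] for i in range(4)]
    y = [0] * 8
    y[0::2] = matvec(U4, sums)
    y[1::2] = matvec(V4, diffs)
    return y


def inverse_direct(x):
    return matvec(transpose(C8), x)


def inverse_decomposed(x):
    # Even inputs go through U^T, odd inputs through V^T; output butterfly follows.
    p = matvec(transpose(U4), x[0::2])
    q = matvec(transpose(V4), x[1::2])
    y = [0] * 8
    for i in range(4):
        y[i] = p[i] + q[i]
        y[7 - i] = p[i] - q[i]
    return y


sample = [3, -1, 4, 1, -5, 9, 2, -6]
print(forward_decomposed(sample) == forward_direct(sample))
print(inverse_decomposed(sample) == inverse_direct(sample))
```

Each decomposed path uses 2 × 16 = 32 multiplications instead of the 64 of the direct product, which is why only a U block and a V block are needed in hardware.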

TABLE I. COEFFICIENTS AMONG DIFFERENT VIDEO CODING STANDARDS

Note that the 4-point U matrix is also used in 4-point transform coding. The 1-D 8-point transform is expressed as

\[
Y_8 = C_8 X_8
\tag{3}
\]

where

\[
Y_8 = [\, y_0 \; y_1 \; y_2 \; y_3 \; y_4 \; y_5 \; y_6 \; y_7 \,]^T,
\qquad
X_8 = [\, x_0 \; x_1 \; x_2 \; x_3 \; x_4 \; x_5 \; x_6 \; x_7 \,]^T
\]

and C_8 is the 8-point transform coefficient matrix. The 1-D 8-point forward transform can be decomposed as

\[
\begin{bmatrix} y_0 \\ y_2 \\ y_4 \\ y_6 \end{bmatrix}
= U_{tran\_4}
\begin{bmatrix} x_0 + x_7 \\ x_1 + x_6 \\ x_2 + x_5 \\ x_3 + x_4 \end{bmatrix},
\qquad
\begin{bmatrix} y_1 \\ y_3 \\ y_5 \\ y_7 \end{bmatrix}
= V_{tran\_4}
\begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix}.
\tag{4}
\]

In addition, the 1-D inverse transform is expressed as

\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{bmatrix}
= U_{tran\_4}^{T} \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix}
+ V_{tran\_4}^{T} \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ x_7 \end{bmatrix},
\qquad
\begin{bmatrix} y_7 \\ y_6 \\ y_5 \\ y_4 \end{bmatrix}
= U_{tran\_4}^{T} \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix}
- V_{tran\_4}^{T} \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ x_7 \end{bmatrix}.
\tag{5}
\]

Computing these matrix products requires many multiplications and additions. The positions and signs of coefficients a to g are the same in every standard, although each standard has its own coefficient values in the transform matrix. The coefficients for MPEG-2/4 used here are 10-bit fixed-point numbers. According to [5], this is the minimum bit width meeting the constraint of the IEEE standard. Additionally, the inverse transform of H.264/AVC must conform to the data flow defined in the standard to avoid mismatch problems. The transform matrices of this data flow are denoted U_{h264}^T and V_{h264}^T (6). The other coefficients are integers defined by the standards. Table I summarizes all coefficients needed among the different video coding standards.

Fig. 1. Proposed 1-D multi-transform VLSI architecture.

III. PROPOSED 1-D MULTI-TRANSFORM CIRCUIT ARCHITECTURE

The proposed architecture is shown in Fig. 1. It mainly consists of the U matrix calculation block, the V matrix calculation block, the preprocessing/postprocessing block, the adder tree block, and two mux blocks. Three pipeline stages are applied at the preprocessing/postprocessing, matrix calculation, and adder tree blocks, as indicated by dashed lines.

The matrix calculation is multiplierless: it is built from adders and shifters only. Two kinds of constant multipliers are used to calculate each term of the matrix product. An AFG constant multiplier is in charge of the U matrix calculations, and a BCDE constant multiplier is in charge of the V matrix calculations. The constant multipliers compute the coefficient products in parallel and can be reconfigured to support different standards.

Finding the optimal solution for MCM problems, i.e., the one with the fewest adders, is known to be NP-complete [2]. Voronenko and Puschel [2] proposed a heuristic graph-based algorithm that generates an optimal directed acyclic graph (DAG) for given multiple constants with multiple outputs. In [3], an algorithm for time-multiplexed MCM was further presented, which produces a single output for multiple constants based on the input control. Unfortunately, neither of them can produce architectures with reconfigurable multiple outputs.

Here, a novel RMCM algorithm based on [2] and [3] is given. An integrated tool with two fusing strategies is developed to support this algorithm. The tool is able to produce architectures with reconfigurable multiple outputs. A small example is demonstrated here. In Fig. 2, two optimal MCM DAGs are displayed for multiplication, i.e., {y0 = 5x, y1 = 19x} in (a) and {y2 = 36x, y3 = x} in (b). Strategy 1 performs node assignment and edge merging from DAG_B to DAG_A. Usually, adders and subtractors are assigned to their counterparts, but configurable adders/subtractors may result. Edge merging appends multiplexers to the inputs of nodes where necessary. Strategy 2 first tries to reconstruct DAG_A and DAG_B to find more common nodes (e.g., 9x) and then performs the same steps as strategy 1. The reconstructed DAG_C and DAG_D are shown in (c) and (d). They may not be the optimal MCM DAGs, but more common nodes can reduce the cost of multiplexers.
(e) and (f) are the RMCM DAGs generated using strategies 1 and 2, respectively. Outputs y0 and y2 are fused to y02, whereas outputs y1 and y3 are fused to y13. In (e), with both multiplexers set to select the left inputs, the active datapaths correspond to (a), and with both multiplexers set to the right inputs, they correspond to (b). In this example, (f), which uses strategy 2, has the better area result.

Fig. 2. (a) Optimal MCM DAG_A for outputs y0 and y1. (b) Optimal MCM DAG_B for outputs y2 and y3. (c) Reconstructed MCM DAG_C for outputs y0 and y1. (d) Reconstructed MCM DAG_D for outputs y2 and y3. (e) RMCM DAG_E using strategy 1 for outputs y02 and y13. (f) RMCM DAG_F using strategy 2 for outputs y02 and y13.

The following pseudocode describes how two MCM DAGs, i.e., DAG_A and DAG_B, are fused into an RMCM DAG.

    FuseMCMDAGs(DAG_A, DAG_B)
        min_cost = MAX_COST
        rmcm_dag = NULL
        // Strategy 1
        for all node mappings of DAG_B to DAG_A do
            dag = FuseDAGs(DAG_A, DAG_B)
            cost = EstimateCost(dag)
            if cost < min_cost then
                min_cost = cost
                rmcm_dag = dag
            end if
        end for
        // Strategy 2
        (DAG_C, DAG_D) = ReconstructDAG(DAG_A, DAG_B)
        for all node mappings of DAG_D to DAG_C do
            dag = FuseDAGs(DAG_C, DAG_D)
            cost = EstimateCost(dag)
            if cost < min_cost then
                min_cost = cost
                rmcm_dag = dag
            end if
        end for
        return rmcm_dag

Fig. 3. Proposed AFG constant multiplier.

The node mappings enumerate all possibilities and respect the ordering of the nodes in both MCM DAGs. The output nodes of each DAG are also aligned. The FuseDAGs process includes both node assignment and edge merging. The EstimateCost process evaluates the cost of the fused DAG according to information from the technology library being used. The RMCM algorithm chooses from the results of the two fusing strategies and guarantees that the number of adder nodes does not exceed that of the larger of the initial DAGs being fused, and that multiplexers are appended at the least cost.

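The two-pass minimum-cost search can be illustrated with a deliberately simplified toy model. This is only a sketch under loud assumptions: a DAG is reduced to a tuple of adder-node operations in topological order, edges, shifts, and multiplexers are ignored, the cost constants are invented, and every function name below is hypothetical rather than part of the authors' tool.

```python
from itertools import combinations

# Toy model (assumption): a DAG is a tuple of "add"/"sub" node operations in
# topological order. The costs are made-up numbers, not library area figures.
FIXED_COST = 1.0         # plain adder or subtractor
CONFIGURABLE_COST = 1.5  # configurable adder/subtractor


def node_mappings(small, large):
    """Order-preserving assignments of each node of `small` to a node of `large`."""
    return combinations(range(len(large)), len(small))


def fuse_dags(large, small, mapping):
    """Merge `small` into `large`; nodes whose operations differ become configurable."""
    fused = list(large)
    for op, target in zip(small, mapping):
        if fused[target] != op:
            fused[target] = "addsub"
    return tuple(fused)


def estimate_cost(dag):
    return sum(CONFIGURABLE_COST if op == "addsub" else FIXED_COST for op in dag)


def best_fusion(dag_a, dag_b):
    """Exhaustively try every mapping and keep the cheapest fused DAG."""
    large, small = (dag_a, dag_b) if len(dag_a) >= len(dag_b) else (dag_b, dag_a)
    candidates = (fuse_dags(large, small, m) for m in node_mappings(small, large))
    return min(candidates, key=estimate_cost)


def fuse_mcm_dags(dag_a, dag_b, reconstruct):
    """Strategy 1 on the given DAGs, strategy 2 on reconstructed ones; keep the cheaper."""
    strategy1 = best_fusion(dag_a, dag_b)
    strategy2 = best_fusion(*reconstruct(dag_a, dag_b))
    return min(strategy1, strategy2, key=estimate_cost)


# Stand-in for DAG reconstruction: it just returns a fixed alternative pair.
def toy_reconstruct(dag_a, dag_b):
    return ("sub", "add", "sub"), ("sub", "sub")


dag_a = ("sub", "add", "add")
dag_b = ("add", "sub")
result = fuse_mcm_dags(dag_a, dag_b, toy_reconstruct)
print(result, estimate_cost(result))
```

In this toy run the reconstructed pair fuses without any configurable node, so strategy 2 wins, and the fused DAG has no more nodes than the larger input DAG. A realistic cost function would also have to account for the multiplexers added during edge merging.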
The RMCM tool generates the hardware structure for multiplying by the matrix coefficients directly in the form of hardware description language. The tool first constructs a DAG for each set of coefficients of one standard. Then, each pair of DAGs (e.g., one DAG with 362, 473, and 196 in the U matrix for MPEG-2/4 and another DAG with 17, 22, and 10 in the U matrix for 4-point VC-1) is fused with the two aforementioned strategies to obtain a lower area result. The RMCM DAG with the minimum cost becomes the final structure of the constant multiplier. It is capable of generating multiple outputs in parallel and of changing their values through different configurations.

The structure of the AFG constant multiplier is shown in Fig. 3. There are five adder nodes in total in the AFG constant multiplier, with a depth of three. It consumes 95% more area than an AFG constant multiplier design that supports only the MPEG-2/4 coefficients. This is the overhead caused by the multiplexing circuits needed for reconfigurability. The structure of the BCDE constant multiplier is shown in Fig. 4. There are six adder nodes in total in the BCDE constant multiplier, with a depth of three, whereas there are seven adder nodes in [5]. The reconfigurability overhead for the BCDE constant multiplier design is 71% more area.

Each node has a name and represents an intermediate value, which is labeled in brackets. Note that under different configurations a node may produce different values, which are separated by commas. All node calculation steps are shown in Table II. The final matrix coefficients are easily obtained from the intermediate nodes, as expressed in Table III. For example, coefficient 362 is the result of left-shifting 181 by 1 bit, where 181 is the AFG node value of sub_31. Moreover, 181 is the result of subtracting 15 from 49 left-shifted by 2 bits, where 49 and 15 are the AFG node values of sub_1 and add_sub_11. The full expression of 362 is then (((64 - (16 - 1)) << 2) - (16 - 1)) << 1.

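The shift-and-subtract chain for coefficient 362 can be written out step by step. The sketch below only mirrors the arithmetic stated above; the function and variable names are illustrative and do not correspond to the node names in the figures.

```python
def times_362(x: int) -> int:
    """Multiply by 362 using shifts and subtractions only."""
    n15 = (x << 4) - x        # 15x  = 16x - x
    n49 = (x << 6) - n15      # 49x  = 64x - 15x
    n181 = (n49 << 2) - n15   # 181x = 4 * 49x - 15x
    return n181 << 1          # 362x = 2 * 181x


# Compare against ordinary multiplication for a few inputs, including negatives.
for value in (0, 1, 7, -3, 255):
    assert times_362(value) == 362 * value
print(times_362(1))  # prints 362
```

Three subtractions on a path of depth three produce 362x, and the intermediate 15x is reused twice, which is the kind of sharing an MCM DAG captures.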
The preprocessing block realizes the butterfly structure of the forward transform and the permutation structure of the inverse transform, whereas the postprocessing block computes the butterfly structure of the inverse transform and the permutation structure of the forward transform. In [4], they are designed separately. Here, a unified preprocessing/postprocessing block is presented using an adder-sharing strategy, which means that the forward and inverse transforms share a common butterfly structure and permutation structure. In this way, eight adders are saved at the cost of eight multiplexers. Fig. 5 shows this architecture.

Fig. 4. Proposed BCDE constant multiplier.
TABLE II. NODE CALCULATION STEPS
TABLE III. COEFFICIENTS CALCULATION STEPS
Fig. 5. Proposed unified preprocessing/postprocessing architecture.
TABLE IV. HARDWARE COST OF 1-D MULTI-TRANSFORM ARCHITECTURE
TABLE V. THREE TRANSFORM FUNCTIONS OF IMPLEMENTATION

The adder tree block is used to obtain the sums of the matrix products. The sign positions of the forward and inverse transform additions are the same for the two 4-point matrices; thus, only the input signals to the adders change. In addition, the mux input and output blocks are responsible for the 4-/8-point and forward/inverse transform selections.

The hardware of the H.264/AVC inverse transform is designed separately. The 8-point inverse transform requires 32 adders [11]. These adders can be shared in the proposed architecture: 4 in the matrix calculation block, 20 in the adder tree block, and 8 in the preprocessing/postprocessing block. The operations x/2, 3x/2, and x/4 in the matrix are performed with x >> 1, x + (x >> 1), and x >> 2, respectively. In detail, x/2 is computed in the four AFG constant multiplier blocks, whereas 3x/2 is computed by add_11 in the four BCDE constant multiplier blocks. In the adder tree block, the additions of the U matrix products remain unchanged, but those of the V matrix products require extra multiplexers to select add/sub operations and to accomplish x/4. The adders in the preprocessing/postprocessing block are shared to perform butterfly operations without any changes.

IV. VLSI IMPLEMENTATION RESULTS AND COMPARISONS

The proposed design is described in Verilog HDL, modeled in MATLAB, and verified inside Synopsys VCS with MATLAB data. The verification of H.264/AVC is done with the help of the JM 17.2 reference software. The design is synthesized using Synopsys Design Compiler with the SMIC 130-nm standard cell library. Table IV gives the design result. It shows that the U and V matrix calculations account for about 60% of the whole architecture; therefore, their improvement is critical. To demonstrate the advantages of the reconfigurable architecture, the forward and inverse transforms are also implemented separately; all three results are listed in Table V.

TABLE VI. 1-D MULTI-TRANSFORM ARCHITECTURE COMPARISONS

Note that a unified architecture can perform both forward and inverse transforms through dynamic reconfiguration, instead of using two architectures. Accordingly, 31% of the area can be saved. Qi et al. [5] also implemented constant multipliers. The BCDE subunit in Fig. from [5] is reimplemented here and shows % more area than the proposed RMCM-based BCDE constant multiplier. The total adder count of the proposed inverse transform architecture is 66, whereas in [5] and [6] the counts are 72 (the number 70 in [5] is wrong, which has been confirmed by the author) and 112. As a result, 6 and 46 adders are saved, respectively. In addition, with the help of CAD tools, the RMCM-based constant multiplier can be extended to support more video coding standards.

Table VI shows comparisons between the proposed and existing designs. To the best of the authors' knowledge, no other architecture supports the codec design of all six video standard transforms. Under the same 130-nm technology, the presented inverse transform architecture can save 1% and 7% area while supporting more standards when compared with those of [5] and [6]. Furthermore, the proposed reconfigurable multi-transform VLSI architecture requires about 8% and 1% area penalty.

The throughputs of the proposed 1-D architecture are 8 pixels/cycle with the 8-point transform and 4 pixels/cycle with the 4-point transform, as it computes the transform data in parallel. Since the architecture applies three pipeline stages, it takes 10 (3 + 7) cycles to finish the processing of eight 1-D 8-point transforms in the row (column) order, considering the pipeline latency. Two of the proposed 1-D multi-transform architectures can be used along with a modified transpose memory [10] to construct a 2-D multi-transform architecture.
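The 3 + 7 cycle count follows the usual pipeline-fill model: the first result appears after as many cycles as there are stages, and each further input adds one cycle. A minimal sketch, assuming one input row is accepted per cycle with no stalls (the function name is hypothetical):

```python
def pipelined_cycles(num_inputs: int, pipeline_stages: int) -> int:
    """Cycles needed to push `num_inputs` back-to-back inputs through a pipeline."""
    if num_inputs < 1 or pipeline_stages < 1:
        raise ValueError("num_inputs and pipeline_stages must be positive")
    return pipeline_stages + (num_inputs - 1)


# Eight 1-D 8-point transforms (one row or column each) through three stages.
print(pipelined_cycles(8, 3))  # prints 10
```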
The total latency for a 2-D transform is 10 (3 + 4 + 3) cycles. Supposing the architecture is fully pipelined, it needs 105 ( ) cycles to process the 2-D 4 × 4 transforms of a 4:2:0 macroblock. This number is larger than that for the 2-D 8 × 8 transform. Therefore, the real-time analysis is based on 2-D 4 × 4 transforms. When used inside a decoder, the proposed multi-transform VLSI architecture can process a 1080P HD video bitstream at 60 Hz under a 55-MHz ( /(16 × 16)) working frequency. When used inside an encoder, both forward and inverse transform functions are required. This is achieved through dynamic reconfiguration. Considering one cycle of reconfiguration time, it needs 211 (105 × 2 + 1) cycles to process the 2-D 4 × 4 transforms of a 4:2:0 macroblock. Thus, it can process a 1080P HD video bitstream at 30 Hz under a 55-MHz ( /(16 × 16)) working frequency. Hence, the multi-transform architecture can be used in a real-time HD video codec to process MPEG-2/4, VC-1, H.264/AVC, and AVS video bitstreams.

V. CONCLUSION

In this brief, a reconfigurable multi-transform VLSI architecture supporting video codec design has been presented. The reconfigurability of the architecture is reflected in two ways: 1) matrix coefficients from different standards can be statically reconfigured based on the RMCM algorithm, and 2) forward and inverse transforms can be dynamically reconfigured with the adder-sharing design of the preprocessing/postprocessing blocks. Moreover, the RMCM algorithm can be used to find optimal solutions to support more standards. This architecture is suitable for transform processing in 1080P HD video codec designs for MPEG-2/4, VC-1, H.264/AVC, and AVS.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers, whose advice helped to improve the quality of this brief.

REFERENCES

[1] W. H. Chen, C. H. Smith, and S. C. Fralick, "A fast computational algorithm for the discrete cosine transform," IEEE Trans. Commun., vol. COM-25, no. 9, pp. 1004-1009, Sep. 1977.
[2] Y. Voronenko and M. Puschel, "Multiplierless multiple constant multiplication," ACM Trans. Algorithms, vol. 3, no. 2, p. 11, May 2007.
[3] P. Tummeltshammer, J. C. Hoe, and M. Puschel, "Time-multiplexed multiple-constant multiplication," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 26, no. 9, Sep. 2007.
[4] J. I. Guo, R. C. Ju, and J. W. Chen, "An efficient 2-D DCT/IDCT core design using cyclic convolution and adder-based realization," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 4, pp. 416-428, Apr. 2004.
[5] H. Qi, Q. Huang, and W. Gao, "A low-cost very large scale integration architecture for multistandard inverse transform," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 7, Jul. 2010.
[6] S. Lee and K. Cho, "Architecture of transform circuit for video decoder supporting multiple standards," Electron. Lett., vol. 44, no. 4, pp. 274-275, Feb. 2008.
[7] C. P. Fan and G. A. Su, "Fast algorithm and low-cost hardware-sharing design of multiple integer transforms for VC-1," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 10, Oct. 2009.
[8] G. A. Su and C. P. Fan, "Low-cost hardware-sharing architecture of fast 1-D inverse transforms for H.264/AVC and AVS applications," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 12, Dec. 2008.
[9] K. Kim and J. S. Koh, "An area efficient DCT architecture for MPEG-2 video encoder," IEEE Trans. Consum. Electron., vol. 45, no. 1, pp. 62-67, Feb. 1999.
[10] T. C. Wang, Y. W. Huang, H. C. Fang, and L. G. Chen, "Parallel 4 × 4 2D transform and inverse transform architecture for MPEG-4 AVC/H.264," in Proc. IEEE ISCAS, May 2003.
[11] Y. C. Chao, H. H. Tsai, Y. H. Lin, J. F. Fang, and B. D. Liu, "A novel design for computations of all transforms in H.264/AVC decoders," in Proc. IEEE ICME, Jul. 2007.


More information

Complexity-rate-distortion Evaluation of Video Encoding for Cloud Media Computing

Complexity-rate-distortion Evaluation of Video Encoding for Cloud Media Computing Complexity-rate-distortion Evaluation of Video Encoding for Cloud Media Computing Ming Yang, Jianfei Cai, Yonggang Wen and Chuan Heng Foh School of Computer Engineering, Nanyang Technological University,

More information

A Comparison of General Approaches to Multiprocessor Scheduling

A Comparison of General Approaches to Multiprocessor Scheduling A Comparison of General Approaches to Multiprocessor Scheduling Jing-Chiou Liou AT&T Laboratories Middletown, NJ 0778, USA jing@jolt.mt.att.com Michael A. Palis Department of Computer Science Rutgers University

More information

Image Compression through DCT and Huffman Coding Technique

Image Compression through DCT and Huffman Coding Technique International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul

More information

DCT-JPEG Image Coding Based on GPU

DCT-JPEG Image Coding Based on GPU , pp. 293-302 http://dx.doi.org/10.14257/ijhit.2015.8.5.32 DCT-JPEG Image Coding Based on GPU Rongyang Shan 1, Chengyou Wang 1*, Wei Huang 2 and Xiao Zhou 1 1 School of Mechanical, Electrical and Information

More information

High Speed Error Detection and Data Recovery Architecture for Video Applications

High Speed Error Detection and Data Recovery Architecture for Video Applications RESEARCH ARTICLE OPEN ACCESS High Speed Error Detection and Data Recovery Architecture for Video Applications D. Kranthi Kumar, T. Mahaboob Doula P. G. Student scholar M. Tech (VLSI) Department of ECE

More information

Power Reduction Techniques in the SoC Clock Network. Clock Power

Power Reduction Techniques in the SoC Clock Network. Clock Power Power Reduction Techniques in the SoC Network Low Power Design for SoCs ASIC Tutorial SoC.1 Power Why clock power is important/large» Generally the signal with the highest frequency» Typically drives a

More information

Energiatehokas laskenta Ubi-sovelluksissa

Energiatehokas laskenta Ubi-sovelluksissa Energiatehokas laskenta Ubi-sovelluksissa Jarmo Takala Tampereen teknillinen yliopisto Tietokonetekniikan laitos email: jarmo.takala@tut.fi Energy-Efficiency Comparison: VGA 30 frames/s, 512kbit/s Software

More information

Time-Frequency Detection Algorithm of Network Traffic Anomalies

Time-Frequency Detection Algorithm of Network Traffic Anomalies 2012 International Conference on Innovation and Information Management (ICIIM 2012) IPCSIT vol. 36 (2012) (2012) IACSIT Press, Singapore Time-Frequency Detection Algorithm of Network Traffic Anomalies

More information

302 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 19, NO. 2, FEBRUARY 2009

302 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 19, NO. 2, FEBRUARY 2009 302 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 19, NO. 2, FEBRUARY 2009 Transactions Letters Fast Inter-Mode Decision in an H.264/AVC Encoder Using Mode and Lagrangian Cost Correlation

More information

A Direct Numerical Method for Observability Analysis

A Direct Numerical Method for Observability Analysis IEEE TRANSACTIONS ON POWER SYSTEMS, VOL 15, NO 2, MAY 2000 625 A Direct Numerical Method for Observability Analysis Bei Gou and Ali Abur, Senior Member, IEEE Abstract This paper presents an algebraic method

More information

JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions

JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions Edith Cowan University Research Online ECU Publications Pre. JPEG compression of monochrome D-barcode images using DCT coefficient distributions Keng Teong Tan Hong Kong Baptist University Douglas Chai

More information

International Journal of Electronics and Computer Science Engineering 1482

International Journal of Electronics and Computer Science Engineering 1482 International Journal of Electronics and Computer Science Engineering 1482 Available Online at www.ijecse.org ISSN- 2277-1956 Behavioral Analysis of Different ALU Architectures G.V.V.S.R.Krishna Assistant

More information

Implementation of Modified Booth Algorithm (Radix 4) and its Comparison with Booth Algorithm (Radix-2)

Implementation of Modified Booth Algorithm (Radix 4) and its Comparison with Booth Algorithm (Radix-2) Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 3, Number 6 (2013), pp. 683-690 Research India Publications http://www.ripublication.com/aeee.htm Implementation of Modified Booth

More information
