All-Digital Timing Recovery and FPGA Implementation

Daniel Cárdenas, Germán Arévalo

D. Cárdenas is with Electronics Department, Col. Politécnico, Universidad San Francisco de Quito, M103A, Cumbayá, Ecuador (e-mail: dcardenas@usfq.edu.ec). G. Arévalo is with AI Technologies Ltd. - Design Center, San Rafael, Ecuador. D. Cárdenas thanks the colleagues at PhotoLab, Istituto Superiore Mario Boella, Torino, Italy, for their economical and technical support.

Abstract: Clock and data recovery (CDR) is an important subsystem of every communication device, since the receiver must recover the exact transmitter clock information, which is usually coded into the incoming stream. Some analogue techniques for CDR have been developed based on PLL theory, employing an external VCO. However, external components can be cumbersome to interface with the digital core (FPGA, DSP) already present in the device. Thus, the digital core is also used to carry out the timing recovery task by all-digital techniques, i.e. without an external VCO. This article describes an all-digital timing recovery subsystem implemented on an FPGA.

Index Terms: Clock and Data Recovery (CDR), FPGA, DSP, Synchronization, Timing Recovery.

I. INTRODUCTION

Clock and data recovery is a key element of a communication receiver. Depending on the characteristics of the transceiver and of the whole communication system, different approaches can be taken to recover the right clock and data information from the incoming data. For digital systems, the traditional approach uses an analog Voltage Controlled Oscillator (VCO) which drives the receiver's sampler. Another approach uses digital signal processing to recover the right data information, so no VCO is needed. This article briefly describes the latter and shows, as an example, a hardware implementation on a Field Programmable Gate Array.

II. TIMING RECOVERY DESCRIPTION

Figure 1 shows a typical baseband PAM communication system where information bits $b_k$ are applied to a line encoder which converts them into a sequence of symbols $a_k$. This sequence enters the transmit filter $G_T(\omega)$ and is then sent through the channel $C(\omega)$, which distorts the transmitted signal and adds noise. At the receiver, the signal is filtered by $G_R(\omega)$ in order to reject the noise components outside the signal bandwidth and reduce the effect of the ISI. The signal at the output of the receiver filter is

$$y(t;\varepsilon) = \sum_m a_m\, g(t - mT - \varepsilon T) + n(t)$$
Equation 1

where $g(t)$ is the baseband pulse given by the overall transfer function $G(\omega)$ (Equation 2), $n(t)$ is the additive noise, $T$ is the symbol period (transmitter) and $\varepsilon$ is the unknown fractional time delay between the transmitter and the receiver, $|\varepsilon| < 1/2$. The symbols $\hat{a}_k$ are estimated from samples of this signal. They are finally decoded to give the sequence of bits $\hat{b}_k$.

$$G(\omega) = G_T(\omega)\, C(\omega)\, G_R(\omega)$$
Equation 2. Overall transfer function

Fig. 1. Basic communication system for baseband PAM

The receiver does not know a priori the optimum sampling instants $\{kT + \varepsilon T\}$. Therefore, the receiver must incorporate a timing recovery circuit (also called clock or symbol synchronizer) which estimates the fractional delay $\varepsilon$ from the received signal. Two main categories of clock synchronizers are distinguished, depending on their operating principle: error tracking (feedback) and feedforward synchronizers [1].

A. Feedforward Synchronizer

Figure 2 shows the basic architecture of the feedforward synchronizer. Its main component is the timing detector, which computes directly the instantaneous value of the fractional delay $\varepsilon$ from the incoming data. The noisy measurements are averaged to yield the estimate $\hat{\varepsilon}$, which is sent as a control signal to a reference signal generator. The generated clock is finally used by the data sampler [1, 2].
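To make the role of the fractional delay in Equation 1 concrete, the following minimal sketch evaluates the noise-free received signal and compares sampling at the optimum instants $kT + \varepsilon T$ with sampling at $kT$. The ideal Nyquist pulse `sinc_pulse`, the symbol values and the delay `eps = 0.3` are illustrative assumptions, not values taken from the prototype.

```python
import math

def sinc_pulse(t, T):
    """Assumed overall pulse g(t): an ideal Nyquist pulse sin(pi t/T)/(pi t/T)."""
    x = math.pi * t / T
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x

def received(t, symbols, T, eps):
    """Noise-free Equation 1: y(t; eps) = sum_m a_m g(t - m T - eps T)."""
    return sum(a * sinc_pulse(t - m * T - eps * T, T)
               for m, a in enumerate(symbols))

# Hypothetical PAM symbols and an assumed fractional delay.
symbols = [1.0, -1.0, 3.0, -3.0, 1.0]
T, eps = 1.0, 0.3

# Sampling at the optimum instants k T + eps T returns the symbols themselves.
aligned = [received(k * T + eps * T, symbols, T, eps) for k in range(len(symbols))]

# Sampling at k T ignores the delay and suffers intersymbol interference.
naive = [received(k * T, symbols, T, eps) for k in range(len(symbols))]

print(aligned)
print(naive)
```

With a Nyquist pulse the aligned samples reproduce the transmitted symbols, while the unaligned ones are visibly distorted, which is why the receiver needs an estimate of the delay.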
Fig. 2. Feedforward (open-loop) synchronizer

B. Feedback Synchronizer

The main component of the feedback synchronizer is the timing error detector, which compares the incoming PAM data with the reference signal, as shown in Figure 3. Its output gives the sign and magnitude of the timing error $e = \varepsilon - \hat{\varepsilon}$. The filtered timing error is used to control the data sampler. Hence, feedback synchronizers use the same principle as a classical PLL [1, 2].

Fig. 3. Feedback (closed-loop) synchronizer

The main difference between these two synchronizer implementations is now evident. The feedback synchronizer minimizes the timing error signal: thanks to the closed loop, the reference signal is used to correct itself. The feedforward synchronizer estimates the timing directly from the incoming data and generates the reference signal directly, so no feedback is needed.

Besides the previous classification, some others can be made. If the synchronizer uses the receiver's decisions about the transmitted data symbols to estimate the timing, it is said to be decision directed; otherwise it is non-data-aided. The synchronizer can also work in continuous or discrete time.

III. CDR HARDWARE ARCHITECTURE

We analyzed two approaches: the hybrid synchronizer, which is implemented partially in the digital domain and partially in the analogue domain; and the digital synchronizer, which operates fully in discrete time. Even though their hardware implementations differ, both have the equivalent architecture of a feedback synchronizer; therefore, the theory for computing the loop parameters is exactly the same. In the following we describe the digital synchronizer; please refer to the article "Clock and Data Recovery for a High Speed Transceiver" in these proceedings for an analysis of a hybrid synchronizer.

A. All-digital architecture

Figure 4 shows the architecture of an all-digital timing recovery. The A/D converter operates with a free-running oscillator that has a nominal frequency identical to that of the D/A used at the transmitter. However, the ratio between the real-world symbol rate and the independent (fixed rate) sampling clock is never rational (and it changes in time), i.e. sampling frequency and baud rate are incommensurate; therefore, sampling is asynchronous with the incoming data.

Fig. 4. Digital architecture

Since the sampling clock is a free-running clock, data synchronization occurs by means of time-varying data interpolation, which creates the samples that would have been obtained if the original sampling had been synchronized with the symbols. After the interpolator, data are sent to the timing error detector and then to the loop filter. The filtered error signal controls an NCO, which closes the loop. The NCO outputs give the correct parameters for interpolation. This architecture is described in more detail in the following section.

IV. DIGITAL TIMING RECOVERY ARCHITECTURE

The digital timing recovery architecture is better depicted in Figure 5. Let $T_s$ be the asynchronous sampling period of the A/D converter, incommensurate with the incoming symbol period $T$. Note that even the slightest difference between the transmitter and receiver clocks may result in cycle slips after some time.

Figure 5. Digital timing recovery feedback synchronizer
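The remark on cycle slips can be illustrated with a short back-of-the-envelope calculation. A relative clock offset $\delta$ makes the receiver drift by $\delta T_s$ per sample, so a full sampling period of drift (one extra or one missing sample) accumulates after about $1/\delta$ samples. The 100 ppm offset below is an assumed figure chosen only for illustration; the 83.33 MHz rate is the nominal receiver clock quoted with Figure 11.

```python
def samples_until_slip(offset_ppm):
    """Samples after which a relative clock offset (in ppm) has accumulated
    a drift of one full sampling period."""
    return 1.0 / (abs(offset_ppm) * 1e-6)

def time_until_slip(offset_ppm, sample_rate_hz):
    """Same interval expressed in seconds for a given nominal sample rate."""
    return samples_until_slip(offset_ppm) / sample_rate_hz

# Assumed offset of 100 ppm between transmitter and receiver clocks.
n_samples = samples_until_slip(100.0)
seconds = time_until_slip(100.0, 83.33e6)
print(f"about {n_samples:.0f} samples, i.e. about {seconds * 1e6:.0f} microseconds")
```

Under these assumptions one sample is gained or lost roughly every 10,000 samples, so the timing processor has to handle such events continuously rather than as rare exceptions.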
We must obtain samples $y(nT + \hat{\varepsilon}T)$, with $n$ integer, at symbol rate $1/T$ from samples taken at rate $1/T_s$. Therefore, the transmitter time scale (defined by $T$) must be expressed in terms of the receiver time scale (defined by $T_s$). Estimation of the fractional time delay $\varepsilon$ is the first important operation in all-digital timing recovery.

$$nT + \hat{\varepsilon}T = T_s\left[\frac{nT}{T_s} + \hat{\varepsilon}\frac{T}{T_s}\right] = T_s\left[L_{int}\!\left(\frac{nT}{T_s} + \hat{\varepsilon}\frac{T}{T_s}\right) + \hat{\mu}_n\right]$$

hence,

$$y(nT + \hat{\varepsilon}T) = y(m_n T_s + \hat{\mu}_n T_s)$$
Equation 3

where $m_n = L_{int}\!\left(nT/T_s + \hat{\varepsilon}T/T_s\right)$, $L_{int}(x)$ returns the largest integer less than or equal to $x$, and $\hat{\mu}_n$ is the difference between one sampling instant at the receiver and the corresponding optimum sample in transmission. The index $m_n$ is called the basepoint and the value $\hat{\mu}_n$ is the estimate of the fractional delay. These concepts are better illustrated in Figure 6.

Figure 6. Time scale of the a) transmitter and b) receiver

Figure 6 and Equation 3 show that the correct sample at instant $nT + \varepsilon T$ can be interpolated from a set of samples defined by the basepoint $m_n$ and the estimated fractional difference $\hat{\mu}_n$ between that basepoint and the new sample to be computed. Note that the time shift $\hat{\mu}_n$ varies in time even though $\varepsilon$ is constant. Equation 3 is the most important one in all-digital timing recovery. The timing parameters $(\hat{\mu}_n, m_n)$ are calculated once the fractional time delay $\varepsilon$ has been estimated.

The second most important function in all-digital timing recovery comprises two operations: decimation, given by the basepoint index, and interpolation, given by the fractional delay. The values of the basepoint and the fractional delay are computed by the timing estimator block in Figure 5. The time-varying interpolator uses these values to compute the optimum sample. The following sections explain in more detail the interpolation, decimation control and fractional delay estimation, as well as their adopted implementation.

A. Timing Error Detector (TED)

The timing error detector (TED) resembles the operation of a phase detector in an analogue PLL, i.e. it gives the error information based on the phase difference between the incoming signal and the reference clock at its input. There are several algorithms to implement a timing error detector digitally, depending on the oversampling factor or modulation format [1, 3, 4]. The available hardware for this prototype, specifically the ADC, allows at most two samples/symbol; moreover, the implementation should offer a good trade-off between complexity and performance. Hence, we selected the error detector from Gardner [5, 6]. For a description of this TED and others, please refer to the article "Clock and Data Recovery for a High Speed Transceiver" in these proceedings.

B. Digital Interpolator

The task of the interpolator is to compute the optimum samples $y(nT + \hat{\varepsilon}T)$ from a set of received samples $x(mT_s)$, as stated before by Equation 3. Figure 7 shows that the interpolation is basically a time-varying filtering process, since $T$ and $T_s$ are incommensurate.

Fig. 7. Digital interpolator filter

The interpolating filter has an ideal impulse response of the form of the sampled $\mathrm{si}(x) = \sin(x)/x$ function, given by Equation 4. It can be thought of as an FIR filter with infinitely many taps whose values depend on the fractional delay $\mu$. Figure 8 shows the response of the digital filter when $\mu = 0.2$; note how the response varies from the underlying continuous response centered at zero.

$$h_n(\mu) = h_I(nT_s, \mu T_s) = \mathrm{si}\!\left[\frac{\pi}{T_s}\left(nT_s + \mu T_s\right)\right]$$
Equation 4. Impulse response of the ideal interpolator
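As an illustration of Equations 3 and 4, the sketch below splits the optimum sampling instant into a basepoint and a fractional delay, and evaluates a truncated set of ideal interpolator taps. The function names, the ratio $T/T_s$ (a receiver clock slightly faster than two samples per symbol) and the value of `eps_hat` are assumptions made for the example only.

```python
import math

def timing_parameters(n, T, Ts, eps_hat):
    """Equation 3: express n T + eps_hat T on the receiver time scale as
    (m_n + mu_n) Ts, with m_n the basepoint and 0 <= mu_n < 1."""
    x = (n * T + eps_hat * T) / Ts
    m = math.floor(x)          # L_int(x): largest integer <= x
    return m, x - m

def si(x):
    """si(x) = sin(x)/x with the limit value 1 at x = 0."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x

def ideal_taps(mu, i1, i2):
    """Equation 4 truncated to n = -i1..i2: h_n(mu) = si(pi (n + mu))."""
    return [si(math.pi * (n + mu)) for n in range(-i1, i2 + 1)]

# Assumed example: receiver sampling slightly faster than 2 samples/symbol.
T, Ts, eps_hat = 1.0, 0.4999, 0.1
params = [timing_parameters(n, T, Ts, eps_hat) for n in range(4)]
for n, (m, mu) in enumerate(params):
    print(n, m, round(mu, 5))

print([round(h, 3) for h in ideal_taps(0.2, 4, 4)])
```

Even with a constant `eps_hat`, the fractional delay returned for successive symbols drifts slowly, which is the time variation of $\hat{\mu}_n$ noted above. The taps for $\mu = 0.2$ correspond to the situation shown in Figure 8.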
Fig. 8. Impulse response of the ideal interpolating filter (continuous response and discrete taps for $\mu = 0.2$)

For a practical implementation, the interpolator can be approximated by a finite-order FIR filter (Equation 5).

$$H_I\!\left(e^{j\omega T_s}, \mu\right) = \sum_{n=-I_1}^{I_2} h_n(\mu)\, e^{-j\omega T_s n}$$
Equation 5

where $I_1$ and $I_2$ define respectively the lowest and the highest samples of the discrete impulse response around the central point. The output of the filter is given by a linear combination of the $(I_2 + I_1 + 1)$ signal samples taken around the basepoint $m_k$. This leads to Equation 6, which is the fundamental equation in digital interpolation. In order to obtain a unique basepoint set, there must be an even number of samples, and interpolation should be performed in the center interval.

$$y(m_k T_s + \mu_k T_s) = \sum_{n=-I_1}^{I_2} x\big[(m_k - n)T_s\big]\, h_n(\mu_k)$$
Equation 6. Digital interpolation

As mentioned before, the coefficients $h_n(\mu)$ of the filter are not fixed; they vary according to $\mu$. A direct implementation requires limiting the possible values of $\mu$ and, for each of them, precomputing and storing the respective coefficients in memory. The problems arise from the discretization error in $\mu$ and from the large complexity of a real hardware implementation. A common solution is to approximate each coefficient by a polynomial in $\mu$.

C. Control and NCO

The interpolator control block computes the basepoint $m$ and the fractional delay $\hat{\mu}$ based on the filtered timing error. Error detectors produce an error signal at rate $1/T$ using samples taken at $kT_I = kT/M_I$, with $M_I$ integer ($M_I = 2$ samples/symbol in this case). Therefore, for every sample, the basepoint $m_k$ and the fractional delay $\mu_k$ have to be computed in order to obtain the interpolated sample $y$. Equation 3 now becomes:

$$y(kT_I + \varepsilon_I T_I) = y\!\left(T_s\left[L_{int}\!\left(\frac{kT_I}{T_s} + \varepsilon_I\frac{T_I}{T_s}\right) + \mu_k\right]\right) = y(m_k T_s + \mu_k T_s)$$
Equation 7

Figure 9 shows a detailed diagram of the full architecture of the digital timing recovery. Note that the loop filter decimates the output of the TED (at $T_s$) before the filtering process, so that the filtered error signal is updated at symbol rate. This is a controlled decimation by $M_I$ at basepoint $m_k$ ($k = nM_I$), slaved to the first one at the interpolator. Apart from that, since the interpolator works (nominally) with $M_I$ samples per incoming symbol, only one of them should be passed to the rest of the subsystem as valid recovered data; this means another decimation process (by $M_I$), also slaved to the one of the interpolator. Therefore, the output of the digital timing recovery block, $y(m_{nM_I}T_s + \mu_{nM_I}T_s)$, is updated at symbol rate.

Fig. 9. Detailed architecture of the all-digital timing recovery

The filtered error signal $w$ constitutes the control word of the timing processor, which computes the basepoint and the fractional delay. As seen, the timing processor controls every block in the digital timing recovery subsystem at every cycle $kT_s$. Its tasks are summarized in the following.

It computes the basepoint $m_k$, i.e. the decimation: the timing processor selects the correct samples that go through the interpolator. If the receiver clock is faster than the incoming sample rate, at some point one extra sample is taken by the ADC; the timing processor does not pass the extra sample to the interpolator, since it is not useful. Note that the rest of the blocks in the subsystem must not operate with this sample either. On the other hand, if the receiver clock is slower than the incoming data rate, at some point one sample is lost; in this case there is no decimation for the interpolator, and all the samples are valid. Since the system works with M-times oversampling ($M$ samples/symbol), the slaved decimations are not affected; they always take just one out of every $M$ samples.

The timing processor also computes the fractional delay $\mu_k$, so it selects the correct impulse response of the interpolator.

The timing processor can be implemented with an NCO [7]. The NCO register $\eta$ is computed iteratively as:

$$\eta(m_k) = \big[\eta(m_k - 1) + w(m_k - 1)\big] \bmod 1$$
Equation 8

and the fractional delay is estimated by:

$$\mu_k \approx \frac{T_I}{T_s}\,\eta(m_k)$$
Equation 9

The prototype works with two samples/symbol at the transmitter and also at the receiver, so nominally $T_I/T_s = 1$. Therefore, the content of the NCO is the fractional delay. Instead of explicitly computing $m_k$, overflow and underflow of the NCO register indicate whether the receiver clock is faster or slower, respectively.

D. Loop Filter

The timing processor based on an NCO allows the digital timing recovery to be considered as an equivalent PLL operating at symbol frequency. Therefore, the loop analysis can be carried out using classical PLL theory [8]. The same consideration applies to the hybrid CDR design described in the article "Clock and Data Recovery for a High Speed Transceiver" in these proceedings; please refer to it for a fuller theoretical description.

The analogue loop filter $F(s)$ must be transformed to the digital domain, $F(z)$, in order to be implemented in the FPGA. The design uses the bilinear transformation, which maps the entire left half of the s-plane onto the interior of the unit circle of the z-plane. Hence, any stable transfer function in the continuous domain $s$ is mapped into a stable z-transform in discrete time. The bilinear transformation is achieved by the following equation.

$$F(s) \rightarrow F(z), \qquad s = \frac{2}{T_s}\,\frac{1 - z^{-1}}{1 + z^{-1}}$$
Equation 10

where $T_s$ is the sample period.

V. FPGA IMPLEMENTATION AND RESULTS

Figure 10 depicts the all-digital solution implemented. The system has been successfully simulated and a first stand-alone test has been carried out.
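The control path described by Equations 8 to 10 can be sketched in a few lines. The example assumes the proportional-plus-integral form $F(s) = K_p + K_i/s$ indicated for the loop filter in Figure 10; applying Equation 10 to it gives $F(z) = K_p + K_i\,(T_s/2)\,(1 + z^{-1})/(1 - z^{-1})$. The class and function names, the register name `eta`, the gains and the period are hypothetical, and floating point stands in for the finite arithmetic of the FPGA.

```python
class PILoopFilter:
    """Bilinear-transform discretisation of the assumed F(s) = Kp + Ki/s."""

    def __init__(self, kp, ki, period):
        self.kp = kp
        self.gain = ki * period / 2.0   # Ki * Ts / 2 from the bilinear mapping
        self.integrator = 0.0
        self.prev_error = 0.0

    def update(self, error):
        # Trapezoidal integration: I[n] = I[n-1] + Ki*Ts/2 * (e[n] + e[n-1])
        self.integrator += self.gain * (error + self.prev_error)
        self.prev_error = error
        return self.kp * error + self.integrator


def nco_step(eta, w):
    """Equation 8: eta <- (eta + w) mod 1, plus wrap indications.

    Overflow marks a receiver clock that is faster (one sample to skip);
    underflow marks a receiver clock that is slower.
    """
    total = eta + w
    rx_fast = total >= 1.0
    rx_slow = total < 0.0
    return total % 1.0, rx_fast, rx_slow


# Assumed gains; a constant unit error makes the control word ramp up.
loop_filter = PILoopFilter(kp=0.5, ki=2.0, period=0.1)
control_words = [loop_filter.update(1.0) for _ in range(3)]
print(control_words)

# With T_I/T_s = 1 (Equation 9) the register content is the fractional delay.
print(nco_step(0.95, 0.1))    # wraps past 1: receiver clock faster
print(nco_step(0.05, -0.1))   # wraps below 0: receiver clock slower
```

In this form the filter needs one accumulator and two multiplications per update, and the NCO only one modular addition.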
Fig. 10. All-digital timing recovery architecture. Blocks shown: incoming 8-PAM signal; ADC with fixed sampling period $T_s$ (2 samples/symbol); interpolator (parabolic); TED (Gardner); loop filter $F(s) = K_p + K_i/s$; NCO & control with FAST and SLOW flags; sample selection; output to the post-equalizer. All blocks after the ADC are implemented in the FPGA.

Note that interfacing this all-digital module with the rest of the subsystems means handling the controlled decimation of interpolated samples. The all-digital timing recovery block works with two samples per symbol at the input and, as the theory says, it should perform decimation by 2 in order to output data at symbol rate. In this case, however, the equalizer that follows at its output also requires two samples per symbol (it acts at sample rate); therefore, no slaved decimation by 2 is performed. A problem arises when the receiver clock is slower than the transmitter clock: usually the only sample obtained is passed to the output, but in this case two samples must be created and passed to the next stage in a single clock cycle.

The all-digital timing recovery was successfully tested in simulations with finite arithmetic, and a similar result was observed in the first stand-alone tests carried out in the FPGA. Figure 11 illustrates the situation when the receiver clock is faster than the transmitter clock; in this case a flag (FLAG RX FAST) indicates the situation and controls the rest of the blocks in the structure so that they do not consider the extra sample in their computations. The lower part of the figure shows the increment of the fractional delay in time. It shows that the flag is activated when the fractional delay wraps around from the maximum (1) to the minimum value (0).
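Figure 10 labels the interpolator as parabolic, which is one instance of approximating each coefficient of Equation 6 by a polynomial in $\mu$. The sketch below uses a common four-tap piecewise-parabolic coefficient set with a free parameter `alpha`; both this particular set and `alpha = 0.5` are assumptions, since the exact coefficients of the prototype are not given here. In the notation of Equation 6 it corresponds to $I_1 = 2$, $I_2 = 1$, i.e. an even number of samples with interpolation in the center interval.

```python
def parabolic_taps(mu, alpha=0.5):
    """Assumed piecewise-parabolic taps h_n(mu) for n = -2, -1, 0, 1.

    Each tap is a second-order polynomial in mu, so no coefficient table
    indexed by mu has to be stored.
    """
    a = alpha
    return {
        -2: a * mu * mu - a * mu,
        -1: -a * mu * mu + (1.0 + a) * mu,
        0: -a * mu * mu - (1.0 - a) * mu + 1.0,
        1: a * mu * mu - a * mu,
    }

def interpolate(x, m, mu, alpha=0.5):
    """Equation 6 with I1 = 2, I2 = 1: sum over n of x[m - n] * h_n(mu)."""
    taps = parabolic_taps(mu, alpha)
    return sum(x[m - n] * h for n, h in taps.items())

# A ramp is reproduced exactly between samples m and m + 1.
ramp = [float(j) for j in range(8)]
print(interpolate(ramp, 3, 0.25))
print(interpolate(ramp, 3, 0.0))
```

The four taps sum to one for every $\mu$, and at $\mu = 0$ the output equals the basepoint sample, so the interpolator introduces no gain error on slowly varying signals.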
Fig. 11. All-digital timing recovery when the receiver clock is faster than the transmitter clock; the flag indicates that one sample should not be considered. Upper trace: FLAG RX FAST (Boolean output); lower trace: fractional delay ($\mu$); horizontal axis: time scale at receiver [µs]. Nominal receiver clock frequency = 83.33 MHz.

Even though a simulation display is shown here, the same behaviour has been observed on a digital oscilloscope in a real test with the first hardware version of the system. Nevertheless, the interfacing of this block with the rest of the subsystems of the receiver is still under design.

VI. ACKNOWLEDGEMENT

The authors would like to acknowledge the economical support of, and the fruitful discussions with, the PhotoLab staff at ISMB, Torino, Italy.

VII. REFERENCES

[1] H. Meyr, M. Moeneclaey, and S. A. Fechtel, Digital Communications Receivers: Synchronization, Channel Estimation and Signal Processing. Vol. 2. 1998: John Wiley & Sons.
[2] U. Mengali and A. N. D'Andrea, Synchronization Techniques for Digital Receivers. 1997, New York: Plenum Press.
[3] K. H. Mueller and M. Müller, Timing Recovery in Digital Synchronous Data Receivers. IEEE Trans. Commun., May 1976. COM-24: p. 516-531.
[4] M. Oerder and H. Meyr, Digital Filter and Square Timing Recovery. IEEE Trans. Commun., May 1988. COM-36: p. 605-612.
[5] F. M. Gardner, A BPSK/QPSK timing-error detector for sampled receivers. IEEE Transactions on Communications, 1986. COM-34(5): p. 423.
[6] M. Oerder and H. Meyr, Derivation of Gardner's Timing Error Detector from the Maximum Likelihood Principle. IEEE Trans. Commun., June 1987. COM-35: p. 684-685.
[7] F. M. Gardner, Interpolation in digital modems. I. Fundamentals. IEEE Transactions on Communications, 1993. 41(3): p. 501.
[8] F. M. Gardner, Phaselock Techniques. Third ed. 2005, New York: Wiley.

VIII. BIOGRAPHIES

Daniel Cárdenas received the Eng. degree in telecommunications engineering from Escuela Politécnica Nacional, Quito, Ecuador, and the M.Sc. (optical communications) and Ph.D. (electronics engineering) from Politecnico di Torino, Torino, Italy, in 2004 and 2008, respectively. He is a coauthor of a patent in the field of telecommunications over plastic optical fibers. He collaborated with Istituto Superiore Mario Boella, Torino, Italy, and he was an external consultant for Francecol, France, and Siemens, Germany. He is currently with USFQ, Ecuador. His current research activities include polymer-optical-fiber-based access systems, and digital signal processing techniques and their implementation on FPGA/DSP.

Germán Arévalo received the Eng. degree in telecommunications engineering from Escuela Politécnica Nacional, Quito, Ecuador, and the M.Sc. in optical communications and photonic technologies from Politecnico di Torino, Torino, Italy, in 2004. He is now the dean of the electronics faculty, Universidad Politécnica Salesiana, Quito, Ecuador, and collaborates with the design center of AI Technologies Ltd., Ecuador.