JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.10, NO.1, MARCH, 2010

Fast Circuit Simulation Based on Parallel-Distributed LIM using Cloud Computing System

Yuta Inoue, Tadatoshi Sekine, Takahiro Hasegawa and Hideki Asai

Abstract: This paper describes a fast circuit simulation technique using the latency insertion method (LIM) with a parallel and distributed leapfrog algorithm. Numerical simulation results on a PC cluster system built on a cloud computing system are shown. As a result, it is confirmed that our method is very useful and practical.

Index Terms: Latency insertion method, fast circuit simulation, parallel computing, cloud computing system

Manuscript received Oct., 2009; revised Dec. 7, 2009.
Department of Information Science and Technology, Shizuoka University, Hamamatsu, Japan
E-mail : inoue@tzasai7.sys.eng.shizuoka.ac.jp

I. INTRODUCTION

In recent years, high-speed and high-density electronic circuit designs have been required for the latest chips, packages and boards. With the progress of integration technology, a variety of signal and power integrity problems have become serious and important. Thus, for efficient designs, a variety of advanced simulation techniques have been required to clarify the various effects of high-speed signal behaviors. LIM has been proposed as one of the fast transient simulation methods applicable to large networks [1-5]. The algorithm of LIM is analogous to a relaxation-based one, which does not need matrix operations, and it therefore seems suitable for parallel implementation. We have already given a parallel-distributed leapfrog algorithm [3, 4] based on the LIM by using MPI [6]. This paper shows novel simulation results obtained on a clustered cloud computing system [7, 8] with sixteen calculation instances.

In our approach, the original circuit is partitioned into several computational domains. The updating calculations in each domain are performed concurrently. In this case, the number of the domains is exactly equal to the number of processing elements (PEs). In this research, a plane circuit, which is frequently given as the model of power distribution networks, is analyzed by the parallel-distributed LIM.

II. LATENCY INSERTION METHOD

LIM is one of the circuit simulation methods based on the leapfrog algorithm for fast transient analysis. Unlike the conventional SPICE-like simulators, which require the time-consuming LU decomposition of large-scale coefficient matrices, the LIM algorithm does not directly need matrix operations. In fact, because the amount of calculation of the LIM algorithm increases only linearly, LIM-based simulation is much faster than the conventional methods for large-scale networks [1-5].

The LIM algorithm requires the circuit to be analyzed to be composed of a combination of certain types of topology, namely the branch and node topologies. The branch topology is shown in Fig. 1(a), and the node topology is shown in Fig. 1(b). Each branch must consist of a series-connected resistance R, inductance L and independent voltage source E, connected between arbitrary nodes a and b in the network. Similarly, each node in the circuit must consist of a parallel-connected conductance G, capacitance C and independent current source H, connected between the node and the reference node, i.e. ground. That is to say, the topology of the network has to satisfy the following requirements: each branch in the network must contain an inductance, and each node in the network must connect a capacitance to ground. Otherwise, a relatively small inductor or shunt capacitor is inserted into the corresponding branch or node, respectively, to generate latency.

Fig. 1. Required linear circuit topologies for LIM algorithm. (a) Linear branch topology for LIM. (b) Linear node topology for LIM.

Thus, in order to generate the updating formulas of LIM for a linear network, applying Kirchhoff's voltage law (KVL) to the branch and Kirchhoff's current law (KCL) to the node with the finite difference method leads to

$$ v_a^{n+1/2} - v_b^{n+1/2} = R\, i_{ab}^{n} + L\,\frac{i_{ab}^{n+1} - i_{ab}^{n}}{\Delta t} - E^{n+1/2} \tag{1} $$

$$ -\sum_{k=1}^{M} i_k^{n} = G\, v^{n+1/2} + C\,\frac{v^{n+1/2} - v^{n-1/2}}{\Delta t} - H^{n} \tag{2} $$

where n is the time step, Δt is the time step size, M is the number of the branches connected to the node, and i_k is the current of the k-th of these branches, taken as flowing out of the node. Note that the time points of the branch currents and the node voltages are staggered by half a time step, which is similar to the algorithm of the FDTD (Finite Difference Time Domain) method for electromagnetic simulation. Then, solving (1) for the branch current i_{ab}^{n+1} and (2) for the node voltage v^{n+1/2} leads to the following updating formulas.

$$ i_{ab}^{n+1} = \frac{L - \Delta t R}{L}\, i_{ab}^{n} + \frac{\Delta t}{L}\left( v_a^{n+1/2} - v_b^{n+1/2} + E^{n+1/2} \right) \tag{3} $$

$$ v^{n+1/2} = \frac{C}{\Delta t G + C}\, v^{n-1/2} + \frac{\Delta t}{\Delta t G + C}\left( -\sum_{k=1}^{M} i_k^{n} + H^{n} \right) \tag{4} $$

Since all terms on the right-hand sides of the updating formulas (3) and (4) are given at the n-th or at a neighboring half time step, each variable is updated only by substituting the values at the past time steps. Therefore, the branch currents and the node voltages are updated alternately and explicitly as time progresses.

III. PARALLEL-DISTRIBUTED LIM

As described above, each current and voltage variable is updated individually in the LIM algorithm, and thereby the current and voltage updating processes can be easily performed in parallel.
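As a minimal sketch of the updating formulas (3) and (4), the following pure-Python function advances a small chain network in which branch k connects node k to node k+1. The function name `lim_chain`, the chain topology and the element values are assumptions chosen for illustration; they are not taken from the paper, which analyzes plane circuits.

```python
def lim_chain(v, i, H, dt, R, L, G, C, E=0.0, num_steps=1):
    """Advance a chain network by num_steps leapfrog steps (illustrative sketch).

    v: node voltages, i: branch currents (len(v) - 1 entries),
    H: constant source current injected into each node.
    Branch k is assumed to carry current from node k to node k + 1.
    """
    n_nodes = len(v)
    for _ in range(num_steps):
        # Formula (4): each node voltage needs only its own old value
        # and the branch currents of the previous step.
        for k in range(n_nodes):
            i_out = 0.0  # sum of the branch currents flowing out of node k
            if k < n_nodes - 1:
                i_out += i[k]
            if k > 0:
                i_out -= i[k - 1]
            v[k] = (C * v[k] + dt * (H[k] - i_out)) / (dt * G + C)
        # Formula (3): each branch current needs only its own old value
        # and the node voltages that were just updated.
        for j in range(n_nodes - 1):
            i[j] = (L - dt * R) / L * i[j] + dt / L * (v[j] - v[j + 1] + E)
    return v, i


# Two nodes, one branch, a 1 A DC source at node 0 (assumed values).
v, i = lim_chain([0.0, 0.0], [0.0], H=[1.0, 0.0],
                 dt=0.1, R=1.0, L=1.0, G=1.0, C=1.0, num_steps=2000)
print(v, i)
```

With these assumed values the transient decays and the result approaches the DC solution of the same circuit (node voltages 2/3 V and 1/3 V, branch current 1/3 A), which gives a simple check of the signs in (3) and (4). No matrix is assembled or factorized at any point; every update touches one variable only.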
When the branch currents are updated at an arbitrary time point, each branch current is updated by itself, without any other variable at the same time step, and refers explicitly only to the variables at the past time points. The same holds when the node voltages are updated. Thus, the updating calculations are decoupled from each other, and therefore they can be performed completely in parallel.

Here, the procedure of the parallel-distributed LIM is described for the plane circuit, which consists of passive, linear and time-invariant components, as shown in Fig. 2. The power/ground plane in a printed circuit board is modeled as the equivalent circuit, and its topology is suitable for the LIM algorithm [3, 4]. In Fig. 2, it is assumed that the number of processing elements (PEs) is two and that the plane circuit is divided into two domains along the interface nodes c, h, m and r. Then, one PE, named PE1, holds the values of the branch currents i_{b,c}, i_{g,h}, i_{l,m}, i_{q,r} and the other currents and voltages in the left half plane. The other one, named PE2, holds i_{c,d}, i_{h,i}, i_{m,n}, i_{r,s} and the other variables in the right half plane. Note that the values of the interface node voltages v_c, v_h, v_m and v_r are held by both PE1 and PE2, and the updating calculations for these voltages are processed by both PEs. In the parallel-distributed LIM, each PE updates only the variables which it holds.

Fig. 2. Partitioning of plane circuit.

Fig. 3 shows the algorithm of the parallel-distributed LIM. In the original LIM, the branch currents and the node voltages are alternately updated in each time step. On the other hand, in the parallel-distributed LIM, first the branch current values of the boundary are updated. Second, the boundary branch current values are communicated to the neighboring PEs. The branch current values and the node voltage values except the boundary part are calculated in each domain during the data communication. Each PE then has to wait for completion of the data communication. Finally, the interface node voltages are updated. These steps are repeated from T = 0 until T > T_max.

Fig. 3. Flowchart of parallel-distributed LIM.

IV. NUMERICAL RESULTS

In order to verify the validity of the original LIM and the parallel-distributed LIM, some example circuits were simulated. Fig. 4 shows an example plane equivalent circuit. In all of the simulations, the waveform with a delay of 0. sec, a rising time of 0. sec, a pulse width of .0 sec, and a magnitude of 0.05 A was used as the input current.

Fig. 4. An example plane equivalent circuit, with the current source and the observation point marked.

First, the simulation results (transient responses) of the plane equivalent circuit composed of 400 unit cells are illustrated in Fig. 5, and Table 1 shows the execution times by HSPICE and the LIM. The simulation has been done on Sparcv9 GHz. The waveform results in Fig. 5 show good agreement between the LIM and HSPICE. From Table 1, it can be seen that the LIM is about 60 times faster than HSPICE in the case of 10,000 unit cells.
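The flow of Fig. 3 can be emulated in a single process to check that partitioning at a shared interface node does not change the result. The sketch below is only an illustration under stated assumptions: it uses a chain network instead of a plane circuit, sets E = 0, and replaces the MPI messages by plain Python variables. The names `Domain`, `serial_run` and `partitioned_run` and all element values are hypothetical.

```python
DT, R, L, G, C = 0.1, 1.0, 1.0, 1.0, 1.0  # assumed element values

def update_current(i_old, v_from, v_to):
    """Formula (3) with E = 0."""
    return (L - DT * R) / L * i_old + DT / L * (v_from - v_to)

def update_voltage(v_old, i_out, h):
    """Formula (4); i_out is the net branch current leaving the node."""
    return (C * v_old + DT * (h - i_out)) / (DT * G + C)

def net_out(i, k, n_nodes):
    """Net current leaving node k of a chain (branch k joins nodes k, k+1)."""
    return (i[k] if k < n_nodes - 1 else 0.0) - (i[k - 1] if k > 0 else 0.0)

def serial_run(H, steps):
    """Original LIM: all currents, then all voltages, in every time step."""
    n = len(H)
    v, i = [0.0] * n, [0.0] * (n - 1)
    for _ in range(steps):
        for j in range(n - 1):
            i[j] = update_current(i[j], v[j], v[j + 1])
        for k in range(n):
            v[k] = update_voltage(v[k], net_out(i, k, n), H[k])
    return v

class Domain:
    """Variables held by one PE; the interface node is its first or last node."""
    def __init__(self, H, interface_first):
        n = len(H)
        self.H, self.v, self.i = H, [0.0] * n, [0.0] * (n - 1)
        self.iface = 0 if interface_first else n - 1  # interface node index
        self.bnd = 0 if interface_first else n - 2    # boundary branch index

    def update_boundary_current(self):
        j = self.bnd
        self.i[j] = update_current(self.i[j], self.v[j], self.v[j + 1])
        return self.i[j]  # value to be sent to the neighboring PE

    def update_inner(self):
        n = len(self.v)
        for j in range(n - 1):
            if j != self.bnd:
                self.i[j] = update_current(self.i[j], self.v[j], self.v[j + 1])
        for k in range(n):
            if k != self.iface:
                self.v[k] = update_voltage(self.v[k], net_out(self.i, k, n), self.H[k])

    def update_interface_voltage(self, received):
        own = self.i[self.bnd]
        # The own boundary branch leaves the interface node only in the right domain.
        i_out = own - received if self.iface == 0 else received - own
        k = self.iface
        self.v[k] = update_voltage(self.v[k], i_out, self.H[k])

def partitioned_run(H, m, steps):
    pe1 = Domain(H[:m + 1], interface_first=False)  # nodes 0..m
    pe2 = Domain(H[m:], interface_first=True)       # nodes m..N-1
    for _ in range(steps):
        to_pe2 = pe1.update_boundary_current()  # update boundary currents
        to_pe1 = pe2.update_boundary_current()  # (the values stand in for messages)
        pe1.update_inner()                      # work overlapped with communication
        pe2.update_inner()
        pe1.update_interface_voltage(to_pe1)    # after the exchange has completed
        pe2.update_interface_voltage(to_pe2)
    return pe1.v, pe2.v

H = [1.0] + [0.0] * 6                  # 7 nodes, 1 A source at node 0
serial = serial_run(H, 200)
left, right = partitioned_run(H, 3, 200)
merged = left + right[1:]              # the interface node appears in both halves
print(max(abs(a - b) for a, b in zip(serial, merged)))
```

Both emulated PEs compute the interface voltage from the same two boundary currents, so the two copies stay equal and the merged result matches the unpartitioned run. In an actual distributed run, the exchange of the two boundary currents would be a non-blocking message transfer that proceeds while `update_inner` executes.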
Fig. 5. Transient simulation result of the network composed of 400 unit cells. Axes: voltage (V) versus time (sec, ×10^-9); curves: LIM and HSPICE.

Table 1. Comparing execution times by HSPICE and LIM

Number of Cells    Execution time (seconds)
                   HSPICE      LIM
400                4.68        0.39
10,000             935.88      5.78

Next, in order to demonstrate the performance of the parallel-distributed LIM, we simulated transient responses of some plane circuits, which are modeled by 1,000,000, 4,000,000 and 9,000,000 unit cells. We first confirmed the performance of a clustering computer network system having two instances. The clustering computer network system is constructed on the cloud computing system provided by the Amazon EC2 service [7]. The performance of two instances, which correspond to PCs, is compared to the single PC case. Each calculation instance has two CPUs, each of which is a quad-core processor. In addition, each process is executed on its own core. Thus, 16 cores are available as the maximum performance.

Fig. 6 shows the relationship between the speed-up ratio and the number of processes for the three kinds of network models under the condition that the number of the time steps was ,000. In the case of the cloud computing system, the speed-up ratio is saturated around the 6 processes. We also ran the simulations on an SGI Altix4700 under the same condition. This high performance computer system is composed of sixteen CPUs, each of which is an Itanium 1.6 GHz. In addition, each process is executed on its own CPU. Table 2 shows the computer environments of the SGI Altix4700 and the cloud computing system. In the case of the SGI Altix4700, the speed-up ratio for all models is monotonically increasing.

Fig. 6. Speed-up ratio comparison of cloud computing system with Altix4700. Axes: speed-up ratio versus number of processes; curves: 1,000,000, 4,000,000 and 9,000,000 unit cells on each system.

Table 2. Computer environments

         SGI Altix4700    Cloud Computing System
CPUs     16               4
Cores    -                16

We also tested the performance of the cloud computing system by using sixteen instances, namely 32 CPUs. Thus, 128 cores are available as the maximum performance. Fig. 7 shows the relationship between the execution time and the number of processes. The execution time monotonically decreases until around 32 processes. Fig. 8 shows the relationship between the speed-up ratio and the number of processes. The speed-up ratio monotonically increases until around 32 processes. These figures clearly show that the execution with 32 processes is around 5 times faster than the execution with 1 process.

Fig. 7. Execution time vs. number of processes. Axes: execution time (sec) versus number of processes; curves: 1,000,000, 4,000,000 and 9,000,000 unit cells.

Fig. 8. Speed-up ratio. Axes: speed-up ratio versus number of processes; curves: 1,000,000, 4,000,000 and 9,000,000 unit cells.

Although the execution time decreases until around 32 processes, it does not decrease further in the range over 32 processes. That is to say, the speed-up ratio is saturated by the bottleneck of data transfer between the CPUs and the main memory. Therefore, the performance cannot be improved by increasing only the number of cores. Instead, it is considered that the execution time monotonically decreases as the number of CPUs is increased.

V. CONCLUSIONS

In this paper, we described the parallel and distributed LIM-based fast simulation method for large-scale linear networks. This method is very useful for the analysis of power distribution networks. First, LIM was briefly reviewed, and it was noted that this method is suitable for parallel and distributed computing. Next, the parallel-distributed LIM was constructed on the cloud computing system. Finally, it was confirmed that the parallel-distributed LIM on the cloud system was very efficient and that the performance scaled almost ideally with the number of CPUs without losing accuracy.

REFERENCES

[1] J. E. Schutt-Ainé, "Latency insertion method (LIM) for the fast transient simulation of large networks," IEEE Trans. Circuit Syst. I, Vol.49, No.1, Jan., 2001, pp.81-89.
[2] H. Kubota, Y. Tanji, T. Watanabe and H. Asai, "Generalized Method of the Time-Domain Circuit Simulation based on LIM with MNA Formulation," Proc. CICC 2005, Sep., 2005, pp.89-9.
[3] T. Watanabe, Y. Tanji, H. Kubota and H. Asai, "Parallel-Distributed Time-Domain Circuit Simulation of Power Distribution Networks with Frequency-Dependent Parameters," Proc. ASP-DAC 2006, Jan., 2006, pp.83-837.
[4] T. Watanabe, Y. Tanji, H. Kubota and H. Asai, "Fast Transient Simulation of Power Distribution Networks Containing Dispersion Based on Parallel-Distributed Leapfrog Algorithm," IEICE Trans. Fundamentals, Vol.E90-A, No.2, Feb., 2007, pp.388-397.
[5] H. Asai and N. Tsuboi, "Multi-Rate Latency Insertion Method with RLGC-MNA Formulation for Fast Transient Simulation of Large-Scale Interconnect and Plane Networks," Proc. ECTC2007, June, 2007, pp.667-67.
[6] http://www.mpi-forum.org/
[7] http://aws.amazon.com/ec2/
[8] Y. Inoue, T. Sekine, T. Hasegawa and H. Asai, "Fast Circuit Simulation Based on Parallel-Distributed LIM using Cloud Computing System," Proc. ITC-CSCC2009, Jul., 2009, pp.845-846.

Yuta Inoue received the B.E. and M.E. degrees in system engineering from Shizuoka University, Hamamatsu, Japan, in 2005 and 2007, respectively. Currently, he is working toward the Ph.D. degree in information science and technology at Shizuoka University. His research interests are in the fast circuit simulation of the large interconnects and the power distribution networks (PDNs) of the chips and packages.

Tadatoshi Sekine received the B.E. and M.E. degrees in system engineering from Shizuoka University, Hamamatsu, Japan, in 2007 and 2009, respectively. Currently, he is working toward the Ph.D. degree in information science and technology at Shizuoka University. His research interests are in the fast circuit simulation of the large interconnects and the power distribution networks (PDNs) of the chips and packages.

Takahiro Hasegawa received the Ph.D. degree in information engineering from Kyushu Institute of Technology, Fukuoka, Japan, in 1997. Since 1997, he has been with Shizuoka University, Hamamatsu, Japan, where he is currently an Associate Professor involved with information infrastructure for the campus network and its security system, including high performance computers and cloud computing.
Hideki Asai received the B.E., M.E., and Ph.D. degrees in electrical engineering from Keio University, Yokohama, Japan, in 1980, 1982, and 1985, respectively. In 1985, he was with the Department of Electrical and Electronics Engineering, Sophia University, Tokyo, Japan. He was a Visiting Professor at Carleton University, Ottawa, ON, Canada and Santa Clara University, Santa Clara, CA (1999-2000). Since 1986, he has been with Shizuoka University, Hamamatsu, Japan, where he is currently a Professor involved with VLSI-CAD and electrical design automation (EDA), analog circuit design, and neural networks. He is the author of the books Exercise Notes of Digital Circuits, CORONA PUBLISHING CO., LTD., 00, and Electronic Circuit Simulation Techniques, SCI TECHS PRESS, 2003. Dr. Asai is a member of the IEEE Nonlinear Circuits and Systems Technical Committee. He was a secretary for the IEEE Circuits and Systems Society Tokyo Chapter (1994-1995), and a secretary of the Technical Group on Nonlinear Problems of the Institute of Electronics, Information and Communication Engineers (IEICE) (1997-1999). He was the chairman of the Technical Group on Nonlinear Problems of the IEICE (2007-2008) and the chairman of the Technical Group on System Packaging CAE of JIEP (2007-2009), and is now an executive board member of JIEP. He was the recipient of the Research Encouragement Awards on the occasion of the Takayanagi anniversary, the 50th anniversary of the founding of the IEICE Tokai branch, and on the occasion of the Saitoh anniversary, and of the Prize for Science and Technology (Research Category) awarded by the Minister of Education, Culture, Sports, Science and Technology, in 1988, 1989, 1993, and 2009, respectively.