Cache and Bandwidth Aware Matrix Multiplication on the GPU


Jesse D. Hall, Nathan A. Carr, John C. Hart
University of Illinois

Abstract

Recent advances in the speed and programmability of consumer level graphics hardware have sparked a flurry of research that goes beyond the realm of image synthesis and computer graphics. We examine the use of the GPU (graphics processing unit) as a tool for scientific computing, by analyzing techniques for performing large matrix multiplies in GPU hardware. An earlier method for multiplying matrices on the GPU suffered from problems of memory bandwidth. This paper examines more efficient algorithms that make the implementation of large matrix multiplication on upcoming GPU architectures more competitive, using only 25% of the memory bandwidth and instructions of previous GPU algorithms.

1 Introduction

The multiplication of matrices is one of the most central operations applied in scientific computing. Recent history has shown continued research into better tuned algorithms that improve the efficiency of matrix multiplication. The ATLAS system for automatic tuning of matrix multiplication for target CPUs has shown much success [Whaley et al. 2001]. New advances in PC chip design (such as streaming SIMD extensions) have led to research into how best to leverage modern micro-architectures for this task [Aberdeen and Baxter 2000].

Recently, consumer based graphics processors (GPUs) have become increasingly powerful and are starting to support programmable features. The parallelism in graphics hardware pipelines makes the GPU a strong candidate for performing many computational tasks, including matrix multiplication [Larsen and McAllister 2001].

We detail a new approach for multiplying matrices on GPU hardware. This approach takes advantage of multiple levels of parallelism found in modern GPU hardware and reduces the bandwidth requirements necessary to make this technique effective. The architecture of modern GPUs relevant to this paper is described in Section 2.
In summary, GPUs receive 3D graphics primitives (typically triangles or quadrilaterals) specified as a set of vertices from the application. These vertices are transformed into screen coordinates, and a fragment is generated for each pixel covered by the primitive (generating these fragments is called rasterization). Fragments contain information such as colors and texture coordinates, which are interpolated across the primitive from values associated with the vertices. These auxiliary attributes are used to shade each fragment, which results in a final color that is written to a pixel in the frame buffer.

One of the most common ways of shading fragments is to use auxiliary attributes known as texture coordinates to index into a previously supplied image (texture). Multiple sets of texture coordinates can be used to retrieve colors from multiple textures; the results are combined to form the final color for the fragment. This is multitexturing. Finally, a shaded fragment can either replace the current value of the pixel in the frame buffer or it can be added to the current value. This version of the graphics pipeline is described in [Woo et al. 1997].

The performance of GPUs comes from the fact that large amounts of parallelism are available in this pipeline. In particular, each fragment is independent of all other fragments, so they can be processed in parallel. Processing of fragments can also be overlapped to hide pipeline stalls and memory latencies, resulting in very efficient use of the hardware.

Multitexturing can be used to multiply matrices [Larsen and McAllister 2001]. An $m \times n$ matrix can be represented by a greyscale texture, with each pixel containing an element of the matrix.¹ These matrices can be displayed on the screen by drawing an $m \times n$-pixel rectangle with the texture coordinates $(0, 0)$, $(0, n-1)$, $(m-1, n-1)$, $(m-1, 0)$ assigned to the vertices (clockwise starting with the upper-left vertex). One benefit of this is that we can access the transpose of a matrix by drawing an $n \times m$-pixel rectangle with texture coordinates $(0, 0)$, $(m-1, 0)$, $(m-1, n-1)$, $(0, n-1)$.
The exact same texture is used for drawing the matrix transposed and untransposed. We have only changed the mapping of the texture image onto the rectangle.

Matrix multiplication performs $C = AB$, where $A$ is an $m \times l$-element matrix and $B$ is an $l \times n$-element matrix. By storing matrices $A$ and $B$ as textures, we can compute $C$ in $l$ multitexturing passes as shown in Figure 1.

    Clear the screen.
    Set the drawing mode to overlay.
    Load texture texA with matrix A.
    Load texture texB with matrix B.
    Set the multitexturing mode to modulate.
    Set frame buffer write mode to accumulate.
    for i = 0 ... l-1
        draw an m x n-pixel rectangle with
            texA coords (0, i), (0, i), (m-1, i), (m-1, i), and
            texB coords (i, 0), (i, n-1), (i, n-1), (i, 0).
    Screen contains result of A x B.

Figure 1: The Larsen-McAllister multipass algorithm for multiplying two matrices.

The texture coordinates $(0, i)$, $(0, i)$, $(m-1, i)$, $(m-1, i)$ replicate the $i$th column of $A$ across the entire rectangle, whereas the texture coordinates $(i, 0)$, $(i, n-1)$, $(i, n-1)$, $(i, 0)$ replicate the $i$th row of $B$. These textured rectangles are then combined as demonstrated in Figure 2.

¹ Textures are typically indexed using $(s, t)$, with $s$ indexing the horizontal axis and $t$ indexing the vertical axis. This is the opposite of standard matrix notation, where the first index represents the row and the second index represents the column. In the rest of this paper, we use the matrix style rather than the texture style. Also, texture coordinates have traditionally been in the range [0...1], with various filters used to generate a color for indices that fall between pixels. We use an extension [Kilgard 2001] to allow integer indexing, and disable filtering.
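As a rough CPU-side illustration of Figure 1 (a sketch, not the GPU implementation), each rendering pass can be modeled as adding the outer product of column $i$ of $A$ and row $i$ of $B$ to an accumulation buffer. The function name `larsen_mcallister_multiply` and the list-of-lists matrix representation are assumptions made for this example, and frame buffer saturation is deliberately not modeled.

```python
def larsen_mcallister_multiply(A, B):
    """Mimic the l accumulate passes of Figure 1 on the CPU.

    A is m x l and B is l x n, both given as lists of rows.
    """
    m, l, n = len(A), len(B), len(B[0])
    screen = [[0.0] * n for _ in range(m)]          # "Clear the screen."
    for i in range(l):                              # one rendering pass per i
        for r in range(m):
            for c in range(n):
                # texA replicates column i of A across row r;
                # texB replicates row i of B down column c.
                # "modulate" multiplies them, "accumulate" adds to the screen.
                screen[r][c] += A[r][i] * B[i][c]
    return screen


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
print(larsen_mcallister_multiply(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```

Every pass touches all $m \times n$ output pixels while contributing only one multiply-add per pixel, which is the bandwidth problem the later sections set out to reduce.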

Figure 2: Demonstration of Larsen-McAllister matrix multiplication on a pair of 4 × 4 matrices. Panels show passes 1 through 4 (column $i$ of $A$ combined with row $i$ of $B$) and the result $A \times B$. (The output in this example is saturated, such that results greater than one appear uniformly white.)

2 Modern GPU Organization

The graphics pipeline implemented on graphics acceleration hardware was classically organized as a series of transformation, clipping and rasterization steps. Modern graphics hardware has generalized this pipeline into programmable elements. The modern graphics pipeline consists of vertex processing, rasterization and fragment processing.

The vertex processor performs operations on the individual vertices of triangles sent to the graphics accelerator. Once transformed, these triangles are rasterized into a collection of pixels. Each pixel output by the rasterizer is called a fragment. The rasterization process linearly interpolates attributes, such as texture coordinates, stored at the vertices and stores the interpolated values at each fragment. A fragment processor uses the interpolated texture coordinates to look up texture values from texture memory, and can perform special-purpose arithmetic operations on both the texture addresses and the fetched texture values.

The vertex processor is structured similarly to vector processors (pipelined on a stream of vertices), whereas the fragment processor is structured similarly to a SIMD array processor (one processor per pixel). Our experiments have shown that the vertex processor does not provide much advantage over existing CPU capabilities, whereas the fragment processor already outperforms the CPU on some operations like ray-triangle intersections [Carr et al. 2002].

Because these processors were originally developed for texturing, the programs the GPU executes are called shaders. A fragment shader is a program executed by the fragment processor that describes the color of each fragment before it is possibly assigned to its corresponding pixel. The inputs to a fragment shader are a set of program constants, interpolated attribute data from triangle vertices, and texture maps (blocks of texture memory addressed by the texture coordinates). Before rendering, the fragment shader is compiled and loaded into the graphics hardware. Primitives (in our case quadrilaterals) are then sent down the graphics pipeline, invoking the enabled fragment shader as they are rasterized. The output of a fragment shader is an output color plotted to the screen.

A fragment shader is allotted a fixed set of temporary registers R0...Rn. Each register holds a single 4-vector corresponding to the four color channels red, green, blue, alpha. The color channels of each register may be accessed individually as Ri.c, where $c \in \{r, g, b, a\}$ and $i \in \{0..n\}$.

Standard arithmetic operations are defined over the set of registers, such as addition and multiplication. For example, R2.b = R1.a + R0.g assigns the blue channel of register R2 to be the sum of the alpha channel of register R1 and the green channel of R0. Modern fragment shaders allow up to four such operations to be executed simultaneously, much like the SIMD instructions found in modern PC architectures. For example:

    R2 = R1.abgr * R0.ggab        (1)

can be issued as a single GPU instruction, which performs four simultaneous multiplications numerically equivalent to:

    R2.r = R1.a * R0.g
    R2.g = R1.b * R0.g
    R2.b = R1.g * R0.a
    R2.a = R1.r * R0.b            (2)

The SIMD nature of the operation defined in (1) allows four multiplications to occur in parallel, taking one fourth the computation time of (2). In equation (1), R1's color channels are referenced in arbitrary order. This is referred to as swizzling. The green channel of R0 is referenced multiple times. This is known as smearing. Arbitrary swizzling and smearing (and also negation) of input operands can be done with no performance penalty. This is in contrast to Intel's SSE instructions, where moving data between channels requires additional instructions.
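To make the swizzle and smear semantics of (1) and (2) concrete, here is a small sketch that models a register as a 4-tuple ordered (r, g, b, a). The helper names `swizzle` and `mul4` are hypothetical and exist only for this illustration; on real hardware the reordering is part of the instruction encoding rather than a separate step.

```python
CHANNELS = "rgba"


def swizzle(reg, pattern):
    """Return a 4-vector whose channels are taken from reg in `pattern` order.

    Repeating a letter in `pattern` (e.g. "ggab") is what the text calls smearing.
    """
    return tuple(reg[CHANNELS.index(c)] for c in pattern)


def mul4(x, y):
    """Component-wise product of two 4-vectors (one SIMD multiply)."""
    return tuple(a * b for a, b in zip(x, y))


R0 = (1.0, 2.0, 3.0, 4.0)   # (r, g, b, a)
R1 = (5.0, 6.0, 7.0, 8.0)

# Equation (1): R2 = R1.abgr * R0.ggab
R2 = mul4(swizzle(R1, "abgr"), swizzle(R0, "ggab"))

# Equation (2): the same result written as four scalar multiplies.
R2_scalar = (R1[3] * R0[1], R1[2] * R0[1], R1[1] * R0[3], R1[0] * R0[2])

print(R2)               # (16.0, 14.0, 24.0, 15.0)
print(R2 == R2_scalar)  # True
```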
The output color of a fragment program is placed in a designated register (usually R0) upon termination of the program. This value is written to the fragment's screen location in the frame buffer.

Fragment shaders have access to three kinds of data: constants, interpolated vertex attribute data (e.g. texture coordinates), and texture data (indexed by texture coordinates). Interpolated vertex attribute data are accessed as the registers T0...Tm, where $m + 1$ is the number of attributes stored with each vertex. Data is fetched from texture memory by the lookup() operation. For example, R0 = lookup(T0.r, T0.g, M) uses the first two coordinates of T0 to access the texture M. We can also perform arithmetic on the texture coordinates before the fetch, or we can use the result of one texture fetch as the coordinates for a second texture fetch (dependent texturing).

Although fragment shaders provide a very powerful SIMD model for programming, they are currently limited in a number of ways. Modern implementations restrict the number of available registers, the total instruction length, and the number of lookup() operations that may occur in a given fragment shader. Control flow is also restricted in fragment shading. For example, branching is not supported and conditional execution is limited to predicating instructions on previously set condition codes.

Our model for fragment shaders is based on the one described by the upcoming DirectX 9.0 specification [Marshall 2001]. This model provides capabilities currently found in vertex shaders at the fragment shader level. This model has also been used to describe the implementation of a ray tracer as a fragment shader [Purcell et al. 2002]. This paper assumes similar fragment processor capabilities, specifically fragment shaders of up to 256 instructions, an unrestricted number of texture access operations, a set of at least six registers, and standard single-precision floating point data formats.

3 Cache Aware Matrix Multiply

Suppose we are multiplying two large matrices $X$ and $Y$, wlog whose dimensions are a perfect power of two, with $n = 2^i$ rows and $n = 2^i$ columns. A general algorithm for computing $Z = XY$ can be expressed as follows:

    for i = 1...n
      for j = 1...n
        Z_ij = 0
        for m = 1...n
          Z_ij = Z_ij + X_im * Y_mj            (3)

The outer two loops are implemented on the GPU by rendering a single screen filling quadrilateral. This implies that everything within the outer two loops must be handled by the fragment shader. Below we have inserted pixel shader pseudo-code in the appropriate places. Matrices $X$ and $Y$ are now assumed to be stored in single channel texture maps and accessed through the fragment shader lookup() operation.

    for i = 1...n
      for j = 1...n
        // fragment shader
        R0.r = 0
        for m = 1...n
          R1.r = lookup(i, m, X)
          R2.r = lookup(m, j, Y)
          R0.r = R0.r + R1.r * R2.r            (4)

The above pseudo-code in (4) requires either that loops are available in fragment shaders or that the fragment shader instruction limit is large enough to allow the innermost loop to be unrolled. We assume neither is realistic. To address this issue, we turn to a standard blocking strategy. Blocking has been shown to improve cache performance, but for this application blocking also serves the purpose of allowing us to work within the constraints of our fragment programming model. The pseudo-code for our blocking strategy is shown in (5). A new matrix $F$ is introduced that is initialized to be all zeroes and used as a temporary store by the routine. The value $b$ is a scalar representing the block size.

    for k = 1...n step b
      for i = 1...n
        for j = 1...n
          // fragment shader
          R3.r = 0
          for m = k...k+b-1
            R1.r = lookup(i, m, X)
            R2.r = lookup(m, j, Y)
            R3.r = R3.r + R1.r * R2.r
          R4.r = lookup(i, j, F)
          R0.r = R3.r + R4.r
      copy frame buffer into texture F         (5)

This new algorithm is a multipass method requiring multiple renderings to the frame buffer. The outer three loops are handled by rendering a screen filling quadrilateral to the frame buffer $n/b$ times. Between each of the $n/b$ passes, the frame buffer is copied into texture map $F$ to be accumulated with the result of the next pass. This copy operation is required since modern GPU hardware does not support direct lookup operations on the frame buffer. Some graphics hardware does, however, support blending modes that allow fragment values to accumulate directly with the contents of the frame buffer, eliminating the need for the temporary texture $F$ and consequently giving more efficient rendering.

The fragment program (which covers the portion inside the $j$ loop) can now be unrolled by choosing an appropriate value of $b$. For our tests we have chosen $b = 32$. We were able to reduce our fragment program to four instructions per iteration of the $m$ loop, or roughly $4b$ instructions for the unrolled loop.

4 Multi-Channel GPU Matrix Multiplies

Texture map sizes on modern day GPUs are often restricted. Let $n$ be the maximum size of any dimension. NVidia's GeForce4 Ti4600 has a maximum allowable renderable size of 2048, limiting multipass programs to two-dimensional textures containing at most $2048 \times 2048$ elements. Texture maps may consist of between one and four channels (luminance, luminance-alpha, RGB or RGBA). This implies that the GeForce4 can handle textures of up to $2048 \times 2048$ texels with four channels each. Methods have already been presented for handling matrices whose dimensions are at most 2048 using single channel texture maps.
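Before extending the method to four channels, the single-channel blocked scheme in (5) can be checked on the CPU. The sketch below is only a model of the data flow: `blocked_multiply` and `naive_multiply` are names invented for this example, Python lists stand in for textures, indices are zero-based, and rebinding `F` plays the role of copying the frame buffer into the temporary texture.

```python
def naive_multiply(X, Y):
    """Reference triple loop, as in (3)."""
    n = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def blocked_multiply(X, Y, b):
    """Model of the multipass algorithm (5) with block size b."""
    n = len(X)
    assert n % b == 0, "this sketch assumes b divides n"
    F = [[0.0] * n for _ in range(n)]            # temporary texture, all zeroes
    for k in range(0, n, b):                     # one rendering pass per block
        frame = [[0.0] * n for _ in range(n)]
        for i in range(n):                       # i and j loops: one fragment each
            for j in range(n):
                r3 = 0.0
                for m in range(k, k + b):        # unrolled on the GPU
                    r3 += X[i][m] * Y[m][j]
                frame[i][j] = r3 + F[i][j]       # add the previous passes
        F = frame                                # "copy frame buffer into texture F"
    return F


X = [[float(4 * i + j) for j in range(4)] for i in range(4)]
Y = [[float((i + 2 * j) % 5) for j in range(4)] for i in range(4)]
print(blocked_multiply(X, Y, 2) == naive_multiply(X, Y))  # True
```

With `b = 1` the model degenerates to one pass per inner-product term, and with `b = n` it becomes the single-pass version in (4).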
It is naturally desirable to be able to handle matrices of larger sizes. The GeForce4, for example, should be able to multiply matrices $4096 \times 4096$ in size by utilizing all four of the color channels. This section describes a matrix multiplication algorithm that takes advantage of this four-component storage capability.

These four-channel textures store matrices of $4096 \times 4096$ 32-bit floating point values, and occupy 64MB. Our GPU matrix multiplication implementation requires four times this space for storing the two operands, a temporary store, and the result. Current consumer level GPUs such as the GeForce4 ship with 128MB of on-board memory, suggesting a maximum matrix side of about 2896 (four matrices of 32MB each). Exceeding this memory threshold under the non-unified memory architecture of present-day PC GPUs results in paging to main system memory, and increased traffic over the graphics card bus.
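The storage figures above follow from simple arithmetic, sketched here under the assumptions that 1MB means $2^{20}$ bytes and that all four matrices (two operands, temporary store, result) are square and equally sized. The function names are illustrative only.

```python
import math

BYTES_PER_FLOAT = 4      # single precision
MB = 2 ** 20             # assumption: binary megabytes


def four_channel_texture_bytes(side):
    """Bytes in a side x side RGBA texture of 32-bit floats."""
    return side * side * 4 * BYTES_PER_FLOAT


def max_matrix_side(memory_mb, matrices=4):
    """Largest square matrix side such that `matrices` copies fit in memory."""
    floats_per_matrix = (memory_mb * MB) // (matrices * BYTES_PER_FLOAT)
    return math.isqrt(floats_per_matrix)


texture_mb = four_channel_texture_bytes(2048) / MB
print(texture_mb)            # 64.0  (one 4096 x 4096 matrix)
print(4 * texture_mb)        # 256.0 (two operands, temporary, result)
print(max_matrix_side(128))  # 2896
```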

4.1 Basic Formulation

Suppose we are multiplying two large matrices $X$ and $Y$, wlog whose dimensions are a perfect power of two, with $2^i$ rows and $2^i$ columns.

$$XY = Z \tag{6}$$

The matrix multiply in (6) can be expressed as the following series of matrix multiplies of smaller matrices:

$$\overbrace{\begin{bmatrix} A & B \\ C & D \end{bmatrix}}^{X} \; \overbrace{\begin{bmatrix} E & F \\ G & H \end{bmatrix}}^{Y} = \overbrace{\begin{bmatrix} AE + BG & AF + BH \\ CE + DG & CF + DH \end{bmatrix}}^{Z} \tag{7}$$

Elements $A$, $B$, $C$, $D$, $E$, $F$, $G$, and $H$ are sub-matrices decomposing $X$ and $Y$. Let the dimensions of $A \ldots H$ be $2^{i-1}$ rows by $2^{i-1}$ columns.

4.2 Blocked Matrix Texture Maps

We can store matrices $X$ and $Y$ as $2^{i-1}$ by $2^{i-1}$ sized texture maps on the GPU, by placing the four sub-matrices in the different color channels RGBA as

$$X = \begin{pmatrix} A_r & B_g \\ C_b & D_a \end{pmatrix}, \qquad Y = \begin{pmatrix} E_r & F_g \\ G_b & H_a \end{pmatrix}. \tag{8}$$

We now introduce subscript notation on matrices $M$, such that $M_i$ for $i \in \{r, g, b, a\}$ refers to the sub-matrices composing $M$. For example, $X_r Y_g = AF$. We have presented our technique for multiplying matrices contained in a single color channel, $X_r Y_r = Z_r$, in Section 3. We can extend this same approach to a SIMD notation by using multiple subscripts $r, g, b, a$. For example:

$$X_{rrbb} Y_{rgrg} = \begin{pmatrix} AE_r & AF_g \\ CE_b & CF_a \end{pmatrix}. \tag{9}$$

The above notation assumes an architecture where four sub-matrices may be operated on in parallel by a single instruction. This notation is useful since graphics hardware is designed in a SIMD manner to work simultaneously on four color channels at a time. Using this notation, we can now concisely express the matrix multiplication of $X$ and $Y$ as follows:

$$X_{rgba} Y_{rgba} = X_{rrbb} Y_{rgrg} + X_{ggaa} Y_{baba} = \begin{pmatrix} AE_r & AF_g \\ CE_b & CF_a \end{pmatrix} + \begin{pmatrix} BG_r & BH_g \\ DG_b & DH_a \end{pmatrix} = Z_{rgba} \tag{10}$$

4.3 The Multi-Channel Algorithm

To apply the formulation derived in (10) in an algorithm usable by graphics hardware, we must first re-examine fragment shader programming. As discussed in Section 2, a single lookup() operation can retrieve a 4-vector corresponding to the four color channels. If we store $X$ as a texture map in blocked form (8), then a single lookup R0 = lookup(i, j, X) can be used to retrieve four values coming from the sub-matrices: R0 = $(A_{ij}, B_{ij}, C_{ij}, D_{ij})$.
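The blocked storage in (8) and the four-wide product in (10) can be verified numerically with a small CPU model. In this sketch a texel is an (r, g, b, a) tuple, and `pack_blocks`, `unpack_blocks`, `swz`, `mad` and `multichannel_multiply` are all hypothetical helpers written for the example. It performs the whole product in one pass, so the multipass blocking over $k$ is intentionally left out.

```python
CH = "rgba"


def pack_blocks(M):
    """Pack an n x n matrix into an (n/2) x (n/2) grid of (r, g, b, a) texels
    holding the upper-left, upper-right, lower-left and lower-right blocks."""
    h = len(M) // 2
    return [[(M[i][j], M[i][j + h], M[i + h][j], M[i + h][j + h])
             for j in range(h)] for i in range(h)]


def unpack_blocks(T):
    """Inverse of pack_blocks."""
    h = len(T)
    M = [[0.0] * (2 * h) for _ in range(2 * h)]
    for i in range(h):
        for j in range(h):
            M[i][j], M[i][j + h], M[i + h][j], M[i + h][j + h] = T[i][j]
    return M


def swz(t, pattern):
    """Swizzle/smear a texel, e.g. swz(t, "rrbb")."""
    return tuple(t[CH.index(c)] for c in pattern)


def mad(x, y, acc):
    """4-wide multiply-add: x * y + acc, per channel."""
    return tuple(a * b + c for a, b, c in zip(x, y, acc))


def multichannel_multiply(X, Y):
    tx, ty = pack_blocks(X), pack_blocks(Y)
    h = len(tx)
    Z = [[None] * h for _ in range(h)]
    for i in range(h):
        for j in range(h):
            acc = (0.0, 0.0, 0.0, 0.0)
            for m in range(h):
                r1, r2 = tx[i][m], ty[m][j]
                acc = mad(swz(r1, "rrbb"), swz(r2, "rgrg"), acc)  # AE, AF, CE, CF
                acc = mad(swz(r1, "ggaa"), swz(r2, "baba"), acc)  # BG, BH, DG, DH
            Z[i][j] = acc
    return unpack_blocks(Z)


def naive_multiply(X, Y):
    n = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


X = [[float(4 * i + j) for j in range(4)] for i in range(4)]
Y = [[float((3 * i + j) % 7) for j in range(4)] for i in range(4)]
print(multichannel_multiply(X, Y) == naive_multiply(X, Y))  # True
```

Each inner iteration fetches one texel from each operand and issues two 4-wide multiply-adds, so eight scalar multiply-adds are performed for every two lookups.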
The four matrix multiplies from equation (10) may now be parallelized within a single fragment shader. The matrix swizzling and smearing suggested by (10) is handled at the per-element level utilizing the capabilities of graphics hardware, as shown in (11).

    for k = 1...n/2 step b
      for i = 1...n/2
        for j = 1...n/2
          // fragment shader
          R3 = 0
          for m = k...k+b-1
            R1 = lookup(i, m, X)
            R2 = lookup(m, j, Y)
            R3 = R1.rrbb * R2.rgrg + R3
            R3 = R1.ggaa * R2.baba + R3
          R4 = lookup(i, j, F)
          R0 = R3 + R4
      copy frame buffer into texture F         (11)

The above algorithm represents an efficient use of the SIMD computation power of the GPU by working on all four color channels in parallel. This implementation only increases the fragment shader instruction count by one per iteration of $m$, resulting in roughly five instructions per iteration, or about $5b$ for the unrolled loop.

5 Analysis

We have analyzed the new blocked and multichannel GPU matrix multiplication algorithms with respect to memory bandwidth, instruction count and predicted performance.

5.1 Bandwidth Considerations

To analyze the potential bandwidth limitations for our approach, we first distinguish between the two bandwidth limited areas of modern GPUs. The external bandwidth is the rate at which data may be transferred between the GPU and the main system memory. On modern PCs this is limited by the speed of the AGP graphics bus, which can transfer data at the rate of 1GB/sec. The internal bandwidth is the rate at which the GPU may read and write from its own internal memory. The GeForce4 Ti 4600 is currently capable of transferring 10.4 GB/sec.

The external bandwidth of the GPU affects our application in two areas. First, the matrices must be copied into the GPU's memory as texture maps, and the result of the computation must be read back from the card into main memory. These transfers use the AGP bus, which currently has a theoretical bandwidth of about 1 GB/s (for AGP 4x). However, in practice sending data to the GPU is much faster than reading data back from the GPU (since the hardware and drivers are optimized for this case), and the best speed we have measured for reading data back into host memory is 175 MB/s.
Even assuming an average of 200 MB/s transfer for both reads and writes, transferring two $1024 \times 1024$ single-precision matrices to the GPU and reading the result back requires 60 ms. At 4 GFLOPS the actual computation takes about 510 ms. For this problem, transfer time is about 11% of the total time. Thus, external bandwidth is a significant but not overwhelming fraction of the total time.

The second part of the algorithm affected by external bandwidth is the time to send the geometry, the screen filling quads on which the matrices are texture mapped. In fact this is a very small amount of data (about 48 bytes per pass), and graphics hardware is very good at transferring geometry information in parallel with other tasks, including running fragment shaders. Thus, this cost is negligible.

One of the primary bottlenecks in performing matrix multiplies on the GPU is the internal bandwidth [Larsen and McAllister 2001]. This is also true for CPU implementations. For our analysis we consider multiplying two $n \times n$ matrices. For both the multi-channel and single-channel block-matrix approaches, the processing of each fragment requires two texture lookup() operations per iteration of the inner $m$ loop, plus an additional lookup() to combine it with the results from the previous pass. A single write to output occurs per fragment per pass as its result is written to the frame buffer. Thus, there are $2b + 2$ memory operations per fragment per pass. In our single channel method, every memory operation transfers 4 bytes of data. Our multichannel method transfers 16 bytes of data (4 channels, 4 bytes per channel) per memory operation.

    method   passes     frags/pass   bytes/frag        total
    L-M      $n$        $n^2$        $16$              $16n^3$
    single   $n/b$      $n^2$        $(2b+2) \cdot 4$  $8n^3(1 + 1/b)$
    multi    $n/(2b)$   $(n/2)^2$    $(2b+2) \cdot 16$ $4n^3(1 + 1/b)$

Table 1: Bytes transferred internally by each GPU matrix multiplication method.

Table 1 summarizes these results and shows the total internal bandwidth in bytes transferred by each method, which is just the product of the number of passes, the fragments per pass and the bytes per fragment. The L-M (Larsen-McAllister) figures assume an implementation based on a single floating-point channel. (Even though multiple channels were mentioned, [Larsen and McAllister 2001] did not describe a multi-channel implementation.) The four-byte floats are accessed four times (two matrices, a temporary store and a result) for a total of 16 bytes transferred per fragment.

Our single-channel block matrix algorithm performs identically to L-M when $b$ is set to one (no blocking). As the blocking size grows, the bandwidth drops by nearly a factor of two when compared to L-M. The multi-channel method further reduces the memory bandwidth by exactly one half over the single channel method, reducing the bandwidth to nearly 25% of L-M.
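The bandwidth totals can be evaluated directly as passes × fragments per pass × bytes per fragment. The sketch below does this for illustrative values $n = 1024$ and $b = 32$; the function names are invented for the example, and 1GB is assumed to mean $2^{30}$ bytes.

```python
GB = 2 ** 30  # assumption: binary gigabytes


def bytes_lm(n):
    """Larsen-McAllister: n passes, n^2 fragments, 16 bytes per fragment."""
    return n * n ** 2 * 16


def bytes_single(n, b):
    """Single-channel blocked: (2b + 2) memory operations of 4 bytes each."""
    return (n // b) * n ** 2 * (2 * b + 2) * 4


def bytes_multi(n, b):
    """Multi-channel blocked: (2b + 2) memory operations of 16 bytes each."""
    return (n // (2 * b)) * (n // 2) ** 2 * (2 * b + 2) * 16


n, b = 1024, 32
print(bytes_lm(n) / GB)         # 16.0
print(bytes_single(n, b) / GB)  # 8.25
print(bytes_multi(n, b) / GB)   # 4.125
print(bytes_single(n, 1) == bytes_lm(n))  # True: b = 1 means no blocking
```

For these values the multi-channel method moves about 26% of the L-M traffic, approaching 25% as $b$ grows.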
5.2 Instructions

The GPU uses the fact that the same instructions are being executed for a large number of fragments to overlap the processing of different fragments. As no communication between executions of the fragment shaders is needed, a large amount of parallelism is available. This parallelism is used to hide the latency of memory operations and other causes of stalls. As a result, when enough fragments are available (as in our case), the running time of a fragment shader is approximately linear in the number of instructions executed (assuming computation is the limiting factor). Therefore, it makes sense to analyze the number of instructions used by our algorithm.

Each execution of the fragment shader needs four instructions for setup and adding in the result of the previous pass. The multi-channel algorithm also needs three instructions per iteration of the inner loop (two multiply-add instructions and one addition to update the texture indices). The single-channel algorithm removes one of the multiply-add instructions. Therefore, the instruction counts are $3b + 4$ for the multi-channel case and $2b + 4$ for the single-channel case. Note that in the multi-channel case, many of the instructions involve a 4-wide data issue, operating on the four color channels in parallel. Currently there is no performance penalty for this since modern GPUs are designed to work natively on multiple channels. Table 2 summarizes the analysis and the total GPU floating point instructions required by each method.

    method   passes     frags/pass   inst/frag   total inst
    L-M      $n$        $n^2$        $2$         $2n^3$
    single   $n/b$      $n^2$        $2b+4$      $n^3(2b+4)/b$
    multi    $n/(2b)$   $(n/2)^2$    $3b+4$      $n^3(3b+4)/(8b)$

Table 2: Floating point operations required by each GPU matrix multiplication method.

The additional fragment program overhead makes the single channel block-structured matrix multiplication longer than the L-M algorithm. As $b$ increases, the instruction count asymptotically approaches that of the L-M algorithm. The multi-channel method executes 3/16ths as many instructions as either of the single channel methods as the block size grows.
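The 3/16 ratio quoted above can be reproduced from the per-fragment counts $2b + 4$ and $3b + 4$. This sketch uses exact rational arithmetic; the function names are illustrative, and the L-M count of two instructions per fragment per pass is an assumption inferred from the stated asymptotic behaviour rather than a figure given explicitly in the text.

```python
from fractions import Fraction


def inst_lm(n):
    """Assumed: n passes, n^2 fragments, 2 instructions per fragment."""
    return Fraction(2 * n ** 3)


def inst_single(n, b):
    """Single-channel blocked: n/b passes, n^2 fragments, 2b + 4 instructions."""
    return Fraction(n, b) * n ** 2 * (2 * b + 4)


def inst_multi(n, b):
    """Multi-channel blocked: n/(2b) passes, (n/2)^2 fragments, 3b + 4 instructions."""
    return Fraction(n, 2 * b) * Fraction(n, 2) ** 2 * (3 * b + 4)


n = 1024
for b in (1, 8, 32, 256):
    ratio = inst_multi(n, b) / inst_single(n, b)
    print(b, float(ratio))       # tends towards 3/16 = 0.1875 as b grows

print(float(inst_single(n, 32) / inst_lm(n)))  # overhead relative to L-M
```

The ratio simplifies to $(3b + 4) / (8(2b + 4))$, which is independent of $n$.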
5.3 Performance

ATLAS has demonstrated 4.0 GFLOPS for matrix multiplication on a 1.5GHz Pentium 4 using Intel's SSE2 SIMD instructions [Dongarra 2001]. Is the GPU comparable to this? For a matrix size of $1024 \times 1024$ and a block size of 32 ($n = 1024$, $b = 32$), our multi-channel algorithm transfers 4.125 GB of data. In order to match the ATLAS P4 SSE numbers, we need to perform this multiplication in 0.5 seconds. This means we will need 8.25 GB/s of bandwidth. Current hardware has a theoretical bandwidth of 10.4 GB/s to the main memory; as with CPUs, it also has a cache between the GPU and memory which supports much higher bandwidth. Thus, existing hardware should be able to support our bandwidth needs. Future hardware is likely to improve both cache size and performance and memory bandwidth.

Performance of CPU implementations of matrix multiplication is typically limited by memory bandwidth, not CPU speed. Larsen and McAllister [Larsen and McAllister 2001] reported similar results with their GPU implementation. However, due to much lower clock speeds on GPUs relative to CPUs, the move from fixed-point byte operations to floating point may increase the processing requirements enough to make the GPU speed the bottleneck.

6 Conclusion

We have presented a multichannel block-based GPU matrix multiplication algorithm. The block structured approach should yield greater cache coherence than previous methods. We also demonstrated that our implementation uses only about 25% of the memory bandwidth and instructions when compared to the previous method.

Our results are currently theoretical as we anticipate the implementation of graphics hardware that supports the DirectX 9.0 standard. We expect such hardware will be available before the final version of this paper is required. (For example, as of this writing, 3Dlabs has just announced a processor, the P10, that partially satisfies upcoming standards.)

We will then be able to provide actual implementation times comparing our cache and bandwidth aware algorithms to the previous work [Larsen and McAllister 2001].

We have made numerous assumptions about the performance of the upcoming hardware. These assumptions are based on the speed of existing hardware, but with the generality to handle the upcoming standards. These simulations and analyses suggest our method to be competitive with modern CPU implementations. The existence of hardware implementations will allow us to perform further tuning and validate our claims with empirical data.

Available hardware would allow us to automatically tune our algorithm for a given GPU in much the same manner as performed by ATLAS. Empirical tests may be run to search the algorithm's parameter space and select the best algorithm for a target GPU. Our algorithm is currently parameterized by its block size, but we could also introduce additional parameters to control order of rasterization and blocking of memory layouts.

The increased memory bandwidth and SIMD organization of the GPU should make it a good choice for scientific applications. We have nonetheless found that the GPU remains about as powerful as the CPU on actual tasks like matrix multiplication and ray tracing [Carr et al. 2002]. This has been disappointing given the increased bandwidth and processing power of the GPU. The constraints of GPU programming, coupled with the low bandwidth connection from the GPU back into the CPU, have been major obstacles in the capitalization of the GPU for scientific applications.

We are nonetheless encouraged by the potential of the GPU. Whereas future enhancements to the CPU explore parallelism through speculative execution and other probabilistic methods, the GPU can exploit parallelism across the frame buffer and across the geometric data. This has been partially responsible for the dominance of the GPU performance growth rate over that of the CPU. As GPU growth continues to outpace CPU growth, we expect the GPU will become the preferred platform for personal high-performance scientific computing.
Acknowledgments

This research was supported in part by the NSF under the ITR grant ACI , and by NVidia Corp. Conversations with Jack Dongarra and Jim Demmel (and his students) were also quite helpful.

References

Aberdeen, D., and Baxter, J. 2000. General matrix-matrix multiplication using SIMD features of the PIII (research note). In European Conference on Parallel Processing.

Carr, N. A., Hall, J. D., and Hart, J. C. 2002. The ray engine. Tech. Rep. UIUCDCS-R , University of Illinois at Urbana-Champaign, Mar.

Dongarra, J. 2001. An update of a couple of tools: ATLAS and PAPI. DOE Salishan Meeting (Available from SLIDES/salishan.ps), Apr.

Kilgard, M. J. 2001. GL_NV_texture_rectangle. content/nvopenglspecs/GL_NV_texture_rectangle.txt.

Larsen, S. E., and McAllister, D. 2001. Fast matrix multiplies using graphics hardware. Super Computing (Nov.).

Marshall, B. 2001. DirectX graphics future. Microsoft DirectX Meltdown 2001 (Jul.).

Purcell, T. J., Buck, I., Mark, W. R., and Hanrahan, P. 2002. Ray tracing on programmable graphics hardware. In Proceedings of SIGGRAPH 2002, ACM Press / ACM SIGGRAPH, J. F. Hughes, Ed., Computer Graphics Proceedings, Annual Conference Series, ACM.

Whaley, R. C., Petitet, A., and Dongarra, J. 2001. Automated empirical optimizations of software and the ATLAS project. Parallel Computing 27, 1-2.

Woo, M., Neider, J., Davis, T., and Shreiner, D. 1997. OpenGL Programming Guide. Addison-Wesley, Reading, MA, USA.


More information

CS100: Introduction to Computer Science

CS100: Introduction to Computer Science I-class Exercise: CS100: Itroductio to Computer Sciece What is a flip-flop? What are the properties of flip-flops? Draw a simple flip-flop circuit? Lecture 3: Data Storage -- Mass storage & represetig

More information

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

More information

(VCP-310) 1-800-418-6789

(VCP-310) 1-800-418-6789 Maual VMware Lesso 1: Uderstadig the VMware Product Lie I this lesso, you will first lear what virtualizatio is. Next, you ll explore the products offered by VMware that provide virtualizatio services.

More information

Your organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows:

Your organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows: Subettig Subettig is used to subdivide a sigle class of etwork i to multiple smaller etworks. Example: Your orgaizatio has a Class B IP address of 166.144.0.0 Before you implemet subettig, the Network

More information

A Combined Continuous/Binary Genetic Algorithm for Microstrip Antenna Design

A Combined Continuous/Binary Genetic Algorithm for Microstrip Antenna Design A Combied Cotiuous/Biary Geetic Algorithm for Microstrip Atea Desig Rady L. Haupt The Pesylvaia State Uiversity Applied Research Laboratory P. O. Box 30 State College, PA 16804-0030 haupt@ieee.org Abstract:

More information

Lesson 17 Pearson s Correlation Coefficient

Lesson 17 Pearson s Correlation Coefficient Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) -types of data -scatter plots -measure of directio -measure of stregth Computatio -covariatio of X ad Y -uique variatio i X ad Y -measurig

More information

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies ( 3.1.1) Limitations of Experiments. Pseudocode ( 3.1.2) Theoretical Analysis

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies ( 3.1.1) Limitations of Experiments. Pseudocode ( 3.1.2) Theoretical Analysis Ruig Time ( 3.) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.

More information

Systems Design Project: Indoor Location of Wireless Devices

Systems Design Project: Indoor Location of Wireless Devices Systems Desig Project: Idoor Locatio of Wireless Devices Prepared By: Bria Murphy Seior Systems Sciece ad Egieerig Washigto Uiversity i St. Louis Phoe: (805) 698-5295 Email: bcm1@cec.wustl.edu Supervised

More information

CHAPTER 3 THE TIME VALUE OF MONEY

CHAPTER 3 THE TIME VALUE OF MONEY CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all

More information

A Secure Implementation of Java Inner Classes

A Secure Implementation of Java Inner Classes A Secure Implemetatio of Java Ier Classes By Aasua Bhowmik ad William Pugh Departmet of Computer Sciece Uiversity of Marylad More ifo at: http://www.cs.umd.edu/~pugh/java Motivatio ad Overview Preset implemetatio

More information

*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature.

*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature. Itegrated Productio ad Ivetory Cotrol System MRP ad MRP II Framework of Maufacturig System Ivetory cotrol, productio schedulig, capacity plaig ad fiacial ad busiess decisios i a productio system are iterrelated.

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed Multi-Evet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information

Basic Elements of Arithmetic Sequences and Series

Basic Elements of Arithmetic Sequences and Series MA40S PRE-CALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elemets of Arithmetic Sequeces ad Series Objective: To establish basic elemets of arithmetic

More information

ODBC. Getting Started With Sage Timberline Office ODBC

ODBC. Getting Started With Sage Timberline Office ODBC ODBC Gettig Started With Sage Timberlie Office ODBC NOTICE This documet ad the Sage Timberlie Office software may be used oly i accordace with the accompayig Sage Timberlie Office Ed User Licese Agreemet.

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

Chapter 10 Computer Design Basics

Chapter 10 Computer Design Basics Logic ad Computer Desig Fudametals Chapter 10 Computer Desig Basics Part 1 Datapaths Charles Kime & Thomas Kamiski 2004 Pearso Educatio, Ic. Terms of Use (Hyperliks are active i View Show mode) Overview

More information

BINOMIAL EXPANSIONS 12.5. In this section. Some Examples. Obtaining the Coefficients

BINOMIAL EXPANSIONS 12.5. In this section. Some Examples. Obtaining the Coefficients 652 (12-26) Chapter 12 Sequeces ad Series 12.5 BINOMIAL EXPANSIONS I this sectio Some Examples Otaiig the Coefficiets The Biomial Theorem I Chapter 5 you leared how to square a iomial. I this sectio you

More information

Chatpun Khamyat Department of Industrial Engineering, Kasetsart University, Bangkok, Thailand ocpky@hotmail.com

Chatpun Khamyat Department of Industrial Engineering, Kasetsart University, Bangkok, Thailand ocpky@hotmail.com SOLVING THE OIL DELIVERY TRUCKS ROUTING PROBLEM WITH MODIFY MULTI-TRAVELING SALESMAN PROBLEM APPROACH CASE STUDY: THE SME'S OIL LOGISTIC COMPANY IN BANGKOK THAILAND Chatpu Khamyat Departmet of Idustrial

More information

Escola Federal de Engenharia de Itajubá

Escola Federal de Engenharia de Itajubá Escola Federal de Egeharia de Itajubá Departameto de Egeharia Mecâica Pós-Graduação em Egeharia Mecâica MPF04 ANÁLISE DE SINAIS E AQUISÇÃO DE DADOS SINAIS E SISTEMAS Trabalho 02 (MATLAB) Prof. Dr. José

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

Shared Memory with Caching

Shared Memory with Caching Vorlesug Recherarchitektur 2 Seite 164 Cachig i MIMD-Architectures ] MIMD-Architekture Programmiermodell Behadlug der Kommuikatioslatez Nachrichteorietiert globaler Adressraum Latez miimiere Latez verstecke

More information

How To Solve The Homewor Problem Beautifully

How To Solve The Homewor Problem Beautifully Egieerig 33 eautiful Homewor et 3 of 7 Kuszmar roblem.5.5 large departmet store sells sport shirts i three sizes small, medium, ad large, three patters plaid, prit, ad stripe, ad two sleeve legths log

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

More information

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009) 18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the Bru-Mikowski iequality for boxes. Today we ll go over the

More information

Project Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments

Project Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments Project Deliverables CS 361, Lecture 28 Jared Saia Uiversity of New Mexico Each Group should tur i oe group project cosistig of: About 6-12 pages of text (ca be loger with appedix) 6-12 figures (please

More information

5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized?

5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized? 5.4 Amortizatio Questio 1: How do you fid the preset value of a auity? Questio 2: How is a loa amortized? Questio 3: How do you make a amortizatio table? Oe of the most commo fiacial istrumets a perso

More information

INVESTMENT PERFORMANCE COUNCIL (IPC)

INVESTMENT PERFORMANCE COUNCIL (IPC) INVESTMENT PEFOMANCE COUNCIL (IPC) INVITATION TO COMMENT: Global Ivestmet Performace Stadards (GIPS ) Guidace Statemet o Calculatio Methodology The Associatio for Ivestmet Maagemet ad esearch (AIM) seeks

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

Biology 171L Environment and Ecology Lab Lab 2: Descriptive Statistics, Presenting Data and Graphing Relationships

Biology 171L Environment and Ecology Lab Lab 2: Descriptive Statistics, Presenting Data and Graphing Relationships Biology 171L Eviromet ad Ecology Lab Lab : Descriptive Statistics, Presetig Data ad Graphig Relatioships Itroductio Log lists of data are ofte ot very useful for idetifyig geeral treds i the data or the

More information

5 Boolean Decision Trees (February 11)

5 Boolean Decision Trees (February 11) 5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected

More information

Listing terms of a finite sequence List all of the terms of each finite sequence. a) a n n 2 for 1 n 5 1 b) a n for 1 n 4 n 2

Listing terms of a finite sequence List all of the terms of each finite sequence. a) a n n 2 for 1 n 5 1 b) a n for 1 n 4 n 2 74 (4 ) Chapter 4 Sequeces ad Series 4. SEQUENCES I this sectio Defiitio Fidig a Formula for the th Term The word sequece is a familiar word. We may speak of a sequece of evets or say that somethig is

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

Optimal Adaptive Bandwidth Monitoring for QoS Based Retrieval

Optimal Adaptive Bandwidth Monitoring for QoS Based Retrieval 1 Optimal Adaptive Badwidth Moitorig for QoS Based Retrieval Yizhe Yu, Iree Cheg ad Aup Basu (Seior Member) Departmet of Computig Sciece Uiversity of Alberta Edmoto, AB, T6G E8, CAADA {yizhe, aup, li}@cs.ualberta.ca

More information

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean 1 Social Studies 201 October 13, 2004 Note: The examples i these otes may be differet tha used i class. However, the examples are similar ad the methods used are idetical to what was preseted i class.

More information

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee

More information

PUBLIC RELATIONS PROJECT 2016

PUBLIC RELATIONS PROJECT 2016 PUBLIC RELATIONS PROJECT 2016 The purpose of the Public Relatios Project is to provide a opportuity for the chapter members to demostrate the kowledge ad skills eeded i plaig, orgaizig, implemetig ad evaluatig

More information

INVESTMENT PERFORMANCE COUNCIL (IPC) Guidance Statement on Calculation Methodology

INVESTMENT PERFORMANCE COUNCIL (IPC) Guidance Statement on Calculation Methodology Adoptio Date: 4 March 2004 Effective Date: 1 Jue 2004 Retroactive Applicatio: No Public Commet Period: Aug Nov 2002 INVESTMENT PERFORMANCE COUNCIL (IPC) Preface Guidace Statemet o Calculatio Methodology

More information

Estimating Probability Distributions by Observing Betting Practices

Estimating Probability Distributions by Observing Betting Practices 5th Iteratioal Symposium o Imprecise Probability: Theories ad Applicatios, Prague, Czech Republic, 007 Estimatig Probability Distributios by Observig Bettig Practices Dr C Lych Natioal Uiversity of Irelad,

More information

Convention Paper 6764

Convention Paper 6764 Audio Egieerig Society Covetio Paper 6764 Preseted at the 10th Covetio 006 May 0 3 Paris, Frace This covetio paper has bee reproduced from the author's advace mauscript, without editig, correctios, or

More information

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is 0_0605.qxd /5/05 0:45 AM Page 470 470 Chapter 6 Additioal Topics i Trigoometry 6.5 Trigoometric Form of a Complex Number What you should lear Plot complex umbers i the complex plae ad fid absolute values

More information

LECTURE 13: Cross-validation

LECTURE 13: Cross-validation LECTURE 3: Cross-validatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Three-way data partitioi Itroductio to Patter Aalysis Ricardo Gutierrez-Osua Texas A&M

More information

Study on the application of the software phase-locked loop in tracking and filtering of pulse signal

Study on the application of the software phase-locked loop in tracking and filtering of pulse signal Advaced Sciece ad Techology Letters, pp.31-35 http://dx.doi.org/10.14257/astl.2014.78.06 Study o the applicatio of the software phase-locked loop i trackig ad filterig of pulse sigal Sog Wei Xia 1 (College

More information

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S CONTROL CHART FOR THE CHANGES IN A PROCESS Supraee Lisawadi Departmet of Mathematics ad Statistics, Faculty of Sciece ad Techoology, Thammasat

More information

Lesson 15 ANOVA (analysis of variance)

Lesson 15 ANOVA (analysis of variance) Outlie Variability -betwee group variability -withi group variability -total variability -F-ratio Computatio -sums of squares (betwee/withi/total -degrees of freedom (betwee/withi/total -mea square (betwee/withi

More information

Now here is the important step

Now here is the important step LINEST i Excel The Excel spreadsheet fuctio "liest" is a complete liear least squares curve fittig routie that produces ucertaity estimates for the fit values. There are two ways to access the "liest"

More information

Chair for Network Architectures and Services Institute of Informatics TU München Prof. Carle. Network Security. Chapter 2 Basics

Chair for Network Architectures and Services Institute of Informatics TU München Prof. Carle. Network Security. Chapter 2 Basics Chair for Network Architectures ad Services Istitute of Iformatics TU Müche Prof. Carle Network Security Chapter 2 Basics 2.4 Radom Number Geeratio for Cryptographic Protocols Motivatio It is crucial to

More information

Unicenter TCPaccess FTP Server

Unicenter TCPaccess FTP Server Uiceter TCPaccess FTP Server Release Summary r6.1 SP2 K02213-2E This documetatio ad related computer software program (hereiafter referred to as the Documetatio ) is for the ed user s iformatioal purposes

More information

FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10

FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10 FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10 [C] Commuicatio Measuremet A1. Solve problems that ivolve liear measuremet, usig: SI ad imperial uits of measure estimatio strategies measuremet strategies.

More information

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions Chapter 5 Uit Aual Amout ad Gradiet Fuctios IET 350 Egieerig Ecoomics Learig Objectives Chapter 5 Upo completio of this chapter you should uderstad: Calculatig future values from aual amouts. Calculatig

More information

Mathematical goals. Starting points. Materials required. Time needed

Mathematical goals. Starting points. Materials required. Time needed Level A1 of challege: C A1 Mathematical goals Startig poits Materials required Time eeded Iterpretig algebraic expressios To help learers to: traslate betwee words, symbols, tables, ad area represetatios

More information

Ekkehart Schlicht: Economic Surplus and Derived Demand

Ekkehart Schlicht: Economic Surplus and Derived Demand Ekkehart Schlicht: Ecoomic Surplus ad Derived Demad Muich Discussio Paper No. 2006-17 Departmet of Ecoomics Uiversity of Muich Volkswirtschaftliche Fakultät Ludwig-Maximilias-Uiversität Müche Olie at http://epub.ub.ui-mueche.de/940/

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

Research Article Sign Data Derivative Recovery

Research Article Sign Data Derivative Recovery Iteratioal Scholarly Research Network ISRN Applied Mathematics Volume 0, Article ID 63070, 7 pages doi:0.540/0/63070 Research Article Sig Data Derivative Recovery L. M. Housto, G. A. Glass, ad A. D. Dymikov

More information

Matrix Model of Trust Management in P2P Networks

Matrix Model of Trust Management in P2P Networks Matrix Model of Trust Maagemet i P2P Networks Miroslav Novotý, Filip Zavoral Faculty of Mathematics ad Physics Charles Uiversity Prague, Czech Republic miroslav.ovoty@mff.cui.cz Abstract The trust maagemet

More information

Sequences and Series

Sequences and Series CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

3. Greatest Common Divisor - Least Common Multiple

3. Greatest Common Divisor - Least Common Multiple 3 Greatest Commo Divisor - Least Commo Multiple Defiitio 31: The greatest commo divisor of two atural umbers a ad b is the largest atural umber c which divides both a ad b We deote the greatest commo gcd

More information

hp calculators HP 12C Statistics - average and standard deviation Average and standard deviation concepts HP12C average and standard deviation

hp calculators HP 12C Statistics - average and standard deviation Average and standard deviation concepts HP12C average and standard deviation HP 1C Statistics - average ad stadard deviatio Average ad stadard deviatio cocepts HP1C average ad stadard deviatio Practice calculatig averages ad stadard deviatios with oe or two variables HP 1C Statistics

More information

NEW HIGH PERFORMANCE COMPUTATIONAL METHODS FOR MORTGAGES AND ANNUITIES. Yuri Shestopaloff,

NEW HIGH PERFORMANCE COMPUTATIONAL METHODS FOR MORTGAGES AND ANNUITIES. Yuri Shestopaloff, NEW HIGH PERFORMNCE COMPUTTIONL METHODS FOR MORTGGES ND NNUITIES Yuri Shestopaloff, Geerally, mortgage ad auity equatios do ot have aalytical solutios for ukow iterest rate, which has to be foud usig umerical

More information

Evaluation of Different Fitness Functions for the Evolutionary Testing of an Autonomous Parking System

Evaluation of Different Fitness Functions for the Evolutionary Testing of an Autonomous Parking System Evaluatio of Differet Fitess Fuctios for the Evolutioary Testig of a Autoomous Parkig System Joachim Wegeer 1, Oliver Bühler 2 1 DaimlerChrysler AG, Research ad Techology, Alt-Moabit 96 a, D-1559 Berli,

More information

CS103X: Discrete Structures Homework 4 Solutions

CS103X: Discrete Structures Homework 4 Solutions CS103X: Discrete Structures Homewor 4 Solutios Due February 22, 2008 Exercise 1 10 poits. Silico Valley questios: a How may possible six-figure salaries i whole dollar amouts are there that cotai at least

More information

HCL Dynamic Spiking Protocol

HCL Dynamic Spiking Protocol ELI LILLY AND COMPANY TIPPECANOE LABORATORIES LAFAYETTE, IN Revisio 2.0 TABLE OF CONTENTS REVISION HISTORY... 2. REVISION.0... 2.2 REVISION 2.0... 2 2 OVERVIEW... 3 3 DEFINITIONS... 5 4 EQUIPMENT... 7

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

Domain 1 - Describe Cisco VoIP Implementations

Domain 1 - Describe Cisco VoIP Implementations Maual ONT (642-8) 1-800-418-6789 Domai 1 - Describe Cisco VoIP Implemetatios Advatages of VoIP Over Traditioal Switches Voice over IP etworks have may advatages over traditioal circuit switched voice etworks.

More information

Asymptotic Growth of Functions

Asymptotic Growth of Functions CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

More information

Determining the sample size

Determining the sample size Determiig the sample size Oe of the most commo questios ay statisticia gets asked is How large a sample size do I eed? Researchers are ofte surprised to fid out that the aswer depeds o a umber of factors

More information

How to read A Mutual Fund shareholder report

How to read A Mutual Fund shareholder report Ivestor BulletI How to read A Mutual Fud shareholder report The SEC s Office of Ivestor Educatio ad Advocacy is issuig this Ivestor Bulleti to educate idividual ivestors about mutual fud shareholder reports.

More information

Measuring Magneto Energy Output and Inductance Revision 1

Measuring Magneto Energy Output and Inductance Revision 1 Measurig Mageto Eergy Output ad Iductace evisio Itroductio A mageto is fudametally a iductor that is mechaically charged with a iitial curret value. That iitial curret is produced by movemet of the rotor

More information

Forecasting. Forecasting Application. Practical Forecasting. Chapter 7 OVERVIEW KEY CONCEPTS. Chapter 7. Chapter 7

Forecasting. Forecasting Application. Practical Forecasting. Chapter 7 OVERVIEW KEY CONCEPTS. Chapter 7. Chapter 7 Forecastig Chapter 7 Chapter 7 OVERVIEW Forecastig Applicatios Qualitative Aalysis Tred Aalysis ad Projectio Busiess Cycle Expoetial Smoothig Ecoometric Forecastig Judgig Forecast Reliability Choosig the

More information

Confidence Intervals for One Mean

Confidence Intervals for One Mean Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

More information

Engineering Data Management

Engineering Data Management BaaERP 5.0c Maufacturig Egieerig Data Maagemet Module Procedure UP128A US Documetiformatio Documet Documet code : UP128A US Documet group : User Documetatio Documet title : Egieerig Data Maagemet Applicatio/Package

More information

Recovery time guaranteed heuristic routing for improving computation complexity in survivable WDM networks

Recovery time guaranteed heuristic routing for improving computation complexity in survivable WDM networks Computer Commuicatios 30 (2007) 1331 1336 wwwelseviercom/locate/comcom Recovery time guarateed heuristic routig for improvig computatio complexity i survivable WDM etworks Lei Guo * College of Iformatio

More information

ADAPTIVE NETWORKS SAFETY CONTROL ON FUZZY LOGIC

ADAPTIVE NETWORKS SAFETY CONTROL ON FUZZY LOGIC 8 th Iteratioal Coferece o DEVELOPMENT AND APPLICATION SYSTEMS S u c e a v a, R o m a i a, M a y 25 27, 2 6 ADAPTIVE NETWORKS SAFETY CONTROL ON FUZZY LOGIC Vadim MUKHIN 1, Elea PAVLENKO 2 Natioal Techical

More information

SYSTEM INFO. MDK - Multifunctional Digital Communications System. Efficient Solutions for Information and Safety

SYSTEM INFO. MDK - Multifunctional Digital Communications System. Efficient Solutions for Information and Safety Commuicatios Systems for Itercom, PA, Emergecy Call ad Telecommuicatios MDK - Multifuctioal Digital Commuicatios System SYSTEM INFO ms NEUMANN ELEKTRONIK GmbH Efficiet Solutios for Iformatio ad Safety

More information

Baan Service Master Data Management

Baan Service Master Data Management Baa Service Master Data Maagemet Module Procedure UP069A US Documetiformatio Documet Documet code : UP069A US Documet group : User Documetatio Documet title : Master Data Maagemet Applicatio/Package :

More information

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The

More information

Chapter 5: Inner Product Spaces

Chapter 5: Inner Product Spaces Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples

More information

Desktop Management. Desktop Management Tools

Desktop Management. Desktop Management Tools Desktop Maagemet 9 Desktop Maagemet Tools Mac OS X icludes three desktop maagemet tools that you might fid helpful to work more efficietly ad productively: u Stacks puts expadable folders i the Dock. Clickig

More information

CREATIVE MARKETING PROJECT 2016

CREATIVE MARKETING PROJECT 2016 CREATIVE MARKETING PROJECT 2016 The Creative Marketig Project is a chapter project that develops i chapter members a aalytical ad creative approach to the marketig process, actively egages chapter members

More information

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio

More information

A Guide to the Pricing Conventions of SFE Interest Rate Products

A Guide to the Pricing Conventions of SFE Interest Rate Products A Guide to the Pricig Covetios of SFE Iterest Rate Products SFE 30 Day Iterbak Cash Rate Futures Physical 90 Day Bak Bills SFE 90 Day Bak Bill Futures SFE 90 Day Bak Bill Futures Tick Value Calculatios

More information