Franck Cappello and Daniel Etiemble LRI, Université Paris-Sud, 91405, Orsay, France

Size: px
Start display at page:

Download "Franck Cappello and Daniel Etiemble LRI, Université Paris-Sud, 91405, Orsay, France Email: fci@lri.fr"

Transcription

1 MPI versus MPI+OenMP on the IBM SP for the NAS Benchmarks Franck Caello and Daniel Etiemble LRI, Université Paris-Sud, 945, Orsay, France Abstract The hybrid memory model of clusters of multirocessors raises two issues: rogramming model and erformance. Many arallel rograms have been written by using the MPI standard. To evaluate the ertinence of hybrid models for existing MPI codes, we comare a unified model (MPI) and a hybrid one (OenMP fine grain arallelization after rofiling) for the NAS 2.3 benchmarks on two IBM SP systems. The sueriority of one model deends on ) the level of shared memory model arallelization, 2) the communication atterns and 3) the memory access atterns. The relative seeds of the main architecture comonents (CPU, memory, and network) are of tremendous imortance for selecting one model. With the used hybrid model, our results show that a unified MPI aroach is better for most of the benchmarks. The hybrid aroach becomes better only when fast rocessors make the communication erformance significant and the level of arallelization is sufficient. Introduction Some rimary suercomuter manufacturers like IBM and Comaq use CLUsters of MUltiProcessors (CLUMP) to rovide scalable high erformance comuters. The future ASCI White machine corresonds to this architecture with 52 6-way SMP nodes. CLUMPs may use different memory models for the interconnection of the SMP nodes: NUMA or Message-Passing. The SGI Origin comuters use the NUMA aroach as well as the Convex Exemlar ones. The IBM SPs and the Comaq SC Clusters use message-assing. In this aer, we focus on CLUMPs with a hybrid memory model (shared memory inside nodes and message assing between nodes) at the hardware level. Two main issues should be addressed for selecting a rogramming model: the ease of use and erformance. First, we must choose between a unified rogramming model or a hybrid one. In the unified rogramming models, the rogrammer uses a single API to describe the communications inside the multirocessor nodes and between them. All the message assing or DSM systems belong to this category. The hybrid memory models mix shared memory inside the multirocessor and message assing between the nodes. MPI+OenMP, MPI+threads are two hybrid models. Because the hybrid rogramming models require managing two different memory models, they are much more comlex for the rogrammer. Also, many existing alications use a MPI imlementation. For these alications, the relevance of the hybrid rogramming models must be carefully examined because, as we will demonstrate, even otimized hybrid codes may rovide insignificant erformance imrovement comared to the original MPI version. The second issue is erformance. It deends on at least three arameters: a) the sharing of communication suort (memory system and network interface) between the rocessors for the unified model i.e. how the er rocessor latency and bandwidth evolve when several rocessors use a same network interface (and rotocol), b) the degree of shared memory arallelism for the hybrid model and c) the seed-u on the arallel section for both models that can be different because loo nests may be executed in a different way. Section 2 resents the rogramming models and the methodology. Section 3 comares the erformance of MPI unified model and the MPI+ OenMP hybrid model for the NAS 2.3 Benchmarks. We resent detailed results for different cluster sizes and data set sizes (CLASS A and CLASS B) in section 4. Section 5 resents related works. Finally, section 6 concludes. 2 Programming models and methodology In this aer, we focus on existing MPI rograms that have been develoed for traditional arallel machines. Because most of the manufacturers rovide extended versions of their communication library for clusters of multirocessors, existing MPI codes can be directly used with a unified MPI model. The alternative is mixing MPI with a shared /2/$. (c) 2 IEEE.

2 memory model such as OenMP. In that case, different ossibilities exist, which must be comared according to the erformance and rogramming effort trade-off. 2. Programming efforts for mixing MPI + OenMP 2.. Fine-grain arallelization From an existing MPI code, the simlest aroach is the incremental one: it consists in OenMP arallelization of the loo nests in the comutation art of the MPI code. This aroach is also called OenMP fine-grain or loo level arallelization. Several otions can be used according to ) the rogramming effort and 2) the choice of the loo nests to arallelize. Several levels of rogramming effort can be used. First ossibility consists in arallelizing loo nests in the comutation art of the MPI code without any manual otimization. Only the correctness of the arallel version versus the sequential version semantic is checked. But the incremental aroach can be significantly imroved by alying several manual otimizations (loo ermutation, loo exchange, use of temorary variables). These otimizations are required ) to transform non arallel loo nests into arallel ones and 2) to imrove the arallel efficiency by avoiding false sharing or reducing the number of synchronization oints (critical sections, barriers). Another issue is the choice of the loo nests to arallelize. One otion is to arallelize all loo nests. This otion has two drawbacks: it increases the rogramming effort and the arallelization of loo nests that doesn t contribute significantly to the global execution time can be counterroductive. The alternative consists in selecting by rofiling the loo nests that contribute significantly to the global execution time. In this work, we have selected the loos to arallelize according to the framework shown in figure. The comlete descrition of intra-node arallelization rocess for the NAS arallel benchmarks is described in []. MPI code Execute with a rofiler Build the calling grah Select loo to arallelize and insert directives Candidate MPI+OenMP code for SMP nodes Remove or refine arallelization Check semantic and seedu False or Slow d own Correct Figure. The arallelization framework 2..2 Coarse-grain arallelization Correct MPI+OenMP code for SMP nodes Instead of alying a two level arallelization (rocess level and loo level), another currently investigated aroach is the coarse-grain OenMP SPMD arallelization. In this aroach, OenMP is still used to take advantage of the shared memory inside the SMP nodes but a SPMD rogramming style is used instead of the traditional shared memory multithread aroach. In this mode, OenMP is used to sawn N threads in the main rogram, having each thread act similarly to a MPI rocess. The OenMP Parallel directive is used at the outermost level of the rogram. The rincile is to sawn the threads just after the sawn of the MPI rocesses (some initializations may searate the two sawns). As for the message assing SPMD aroach, the rogrammer must take care of several issues: array distribution among threads, work distribution among threads and coordination between threads. Since the array distribution is done assuming a shared memory, the distribution of the arrays only concerns the attribution of different array regions to the different running threads. For maximum erformance, these regions should not overla for write references. The work distribution is made according to the array distribution. Tyically, the OenMP DO directive is not used for distributing the loo iterations among threads. Instead, the rogrammer inserts some calculations of the loo boundaries that deends on the thread number. Coordinating the threads involves managing critical sections (I/O, MPI calls) using either OenMP directives like Master or thread library calls like om get thread num() to guard conditional statements. Few results have been ublished on the coarse-grain mixed OenMP+MPI arallelization. In this aer, we only consider the fine grain incremental aroach including manual otimizations and rofiling to choice the loo nests to arallelize. The comarison results between MPI and MPI+OenMP erformance resented in this aer are only valid for this aroach. 2.2 MPI versus MPI+OenMP MPI The MPI rocesses within nodes communicates through the memory (MP-SHARED-MEMORY otion on the IBM SP) without using the network interface. The user sace mode has been used for highest ossible erformance. With this model, existing MPI codes run without modification (excet for time measurements). MPI+OenMP Parallelizing an alication for the SPMD MPI aradigm often roduces a rogram with the tyical layout resented in the left art of figure 2. It corresonds to the arallelization for a cluster of unirocessor nodes, where a MPI rocess is allocated on each node. The right art shows how the comutation art of the MPI rocess is slit into threads by using OenMP directives. The number of threads corresonds to the number of CPUs in each node. In Figure 2, P is the art of MPI code that can- 2

3 S P C Sy S Message assing (MPI) seq art init MPI init // art com. comm. sync end // art seq art Message assing + shared memory (OenMP) seq art init SMP art unarallel. com. arallel. com. sync end SMP art P P Pn instructions. The -qcache=auto otion lets the comiler decides whether otimizing or not according to the features of memory hierarchy. For OenMP, the following otions are needed: -qsm=noauto:schedule=static. It means that the comiler must not arallelize automatically, but obey the directives (OenMP or other ones). The scheduling otion indicates how the loo iterations must be distributed among the different threads. The static otion without argument means a block distribution with a block size of I/n, where I is the number of loo iterations and n is the number of arallel threads that will run concurrently. Table details the bidirectional communication erformance (asynchronous echo test) with WH2 nodes. Note that no comiler otimizations were used for these measures. Figure 2. Parallelizing a MPI code with OenMP by using the fine grain aroach: the comutation art of the MPI code on a node is arallelized with OenMP directives. not be arallelized with OenMP and Pn is the OenMP arallel art. 2.3 IBM SP systems Two IBM SP systems were used for the exeriments because they exhibit two different balances between the erformance of the main comonents (CPU, Memory, Network) erformance. The first system gathers 32 Winter- Hawk II (WH2) multirocessors with 4 Power3+ CPUs er node running at 375 MHz. Memory system for these nodes is based on a bus with a.6 GB/s maximum bandwidth. The second system connects 8 NightHawk I (NH) multirocessors based on Power3 CPUs running at 222 MHz. Each NH multirocessor rovides a theoretical 4.2 GB/s eak memory bandwidth under secific assumtions (memory bank oulation). The Power3 CPU has a 64 KB data cache. The unified L2 cache sizes are resectively 8 MB (WH2 nodes) and 4 MB (NH nodes) er CPU. The multi-stage interconnection network is the same for both SP systems. Each multirocessor node has one network interface. The maximum er node network bandwidth is 5-MB/sec unidirectional and 3-MB/sec bidirectional. The SP software environment includes the AIX 4.3 oerating system, the XLF 6. Fortran comiler that includes OenMP directives and the PPE 2.4 Parallel Environment (intra-node MPI communications use the shared memory). The following comiler otions have been used: -O3 -qarch=wr3 -qtune=w3 -qcache=auto for MPI. The Power3 microrocessor otions allow the use of refetch Unirocessor 4-way node 4-way node External comm. Internal comm. Bandwidth 93 MB/s 49 MB/s 36 MB/s ( comm.) (er CPU) 74 Mo/s (2 comm.) Latency ( comm.) 8.4 (2 comm.) Table. Communication erformance for bidirectional oint to oint tests ( comm. means that only two rocessors are communicating. 2 comm. means that two coules of rocessors are communicating simultaneously). Table shows that each CPU can use /4 of the maximum bandwidth delivered by the network board. The measured standard deviation is very low. It is lower than 5% for the er rocessor communication bandwidth when 4 CPUs from one node communicate with 4 CPUs of another node (extended echo test for clusters of SMP nodes). For this communication scheme, each CPU exeriences a latency of. For a given number of small messages to send from a node, the total communication time is thus two times smaller with 4 CPUs er node than with only one. The reason is the overla between the comutation arts of the communications. The internal communication erformance is far greater than the external one. Each node has a bus between the CPUs and the memory. This is why the bandwidth decreases by a half for two simultaneous communications within the SMP node. 2.4 Using SMP nodes 2.4. Scalability of the NAS benchmarks with -way nodes Before examinig the scalability of the NAS benchmarks with the two different memory models on clusters of multirocessors, we resent some background results about their scalability on a cluster of unirocessors. The behaviour of 3

4 Seedu Class A Perfect cg ft lu mg bt s Number of -way nodes Seedu Class B Perfect cg ft lu mg bt s Number of -way nodes Figure 3. Seedu for the NAS benchmarks with WH2 nodes. The left art corresonds to Class A and the right art to Class B. the NAS benchmark highly deends on the balance between the main comonents of the arallel architectures. In [2] most of the Class A benchmarks excet IS and FT have a linear seedu u to 32 rocessors on recent architectures. The SGI Origin even rovides suerlinear seedu for some benchmarks. The figure 3 resents the seedu of the Class A and class B benchmarks for the SP3 with WH2 nodes. The observed seedus, which deend on the communication/comutation ratio, are sensitive to the dataset sizes, as shown by CG and LU (sub-linear for Class A and suerlinear for Class B for 32 nodes). Previous results available on the NAS Benchmark Web age generally show sublinear seedus for most of the comuters. [2] resents a dee analysis of the reasons behind the different behaviours of the NAS benchmarks on different machines. The main arameters of the scalability are the sequential node architecture and the erformance of the communication system. The authors demonstrate that although the SGI Origine has a relatively low communication efficiency on the NAS communication atterns, some suerlinear seedus are achievable because of the node architecture (cache, memory system). -way/4-way exec. time ratio CPU 6-CPU 32-CPU cg-a cg-b ft-a ft-b lu-a lu-b mg-a mg-b Figure 4. Ratio of the MPI execution time with -way WH2 nodes over the MPI execution time with 4-way nodes according to the number of CPUs for the class A and class B benchmarks. The values are less than when -way nodes are better way nodes versus 4-way nodes Before going into details of the erformance comarison, we want to confirm the intuition that sharing the memory system and network interface makes a cluster of SMP nodes less efficient than a cluster of unirocessor nodes using the same number of CPUs. On the SP system with 32 WH2 nodes, we have comared the execution times of class A and class B MPI benchmarks when using the same number of CPUs, with either -way nodes or 4-way nodes. The comarison have thus been done with 8, 6 and 32 CPUs. Results are shown in figure 4. The figure shows that the ratio between clusters of -way nodes and clusters of 4-way nodes is always less than for any benchmark and any number of CPUs. Performance ratio clearly deends on the alication. For FT, the ga is accentuated because this benchmark uses global communications, which are resently not highly otimized on the SP systems. The results of the comarison confirms the ones resented in [5] where the authors claim that a cluster of unirocessors can be faster that a cluster of multirocessors. The detailed comarison of these 4

5 two tyes of clusters is out of the scoe of this aer, which aims on comaring rogramming models. More, comaring erformance according to the number of CPUs is very debatable according to the cost issues and the current trends in arallel architectures. Because of the increasing ga between CPU erformance and DRAM erformance, the technological trends are towards larger SMP nodes (NH2 nodes of the IBM SP use 6-way nodes and the future SP4 systems will have 32-way nodes) built as clusters of smaller SMP nodes. This is why we only consider erformance of SMP nodes in the rest of the aer. 4-way nodes are the largest nodes that can be used to comare MPI and MPI+OenMP on the two SP systems that were used. 2.5 Methodology We first alied the incremental aroach to the original MPI NPB-2.3 benchmarks to get the hybrid version. The NAS benchmarks have not been tuned secifically to take advantage of the memory hierarchy of the Power3 for both models. So erformance results do not give the highest ossible value for these codes. However, we try to be fair in comaring codes with quite equivalent otimization levels. Then, the execution time have been decomosed into comutation and communication times. For that, all benchmarks have been instrumented with time measurements. Timings are accumulated across all nodes and divided by the number of nodes to give mean values er rocessor. The communication time includes the communication calls and the synchronization rocedures (Wait). During the exeriments on the IBM latforms and u to now, no software was available to access the hardware erformance counters of the Power3 on the IBM SP. So all interretations of the measured results are based on the timing measurements. We have taken care to limit the interretation to the clearly aarent henomena. A revious study has been done and ublished [3] for a cluster of way Pentium-II nodes. On this latform, we had access to the hardware erformance counters and we could rovide a deeer analysis. 3 Overall comarison between MPI and MPI+OenMP Figure 5 resents the ratio between the MPI execution time and the MPI+OenMP execution time according to the number of 4-way nodes for each class of benchmarks (A or B) and each tye of SP nodes (WH2 and NH). For BT and SP, which use a square number of rocesses, we only use the number of nodes (, 4 and 6) that allows a direct comarison with the other benchmarks, omitting the 9-node configuration. A ratio less than means that the unified MPI version is more efficient than the hybrid version. The comarison results are clearly alicationdeendent. Whatever nodes and data sets are used, the unified MPI model is always better for LU, MG, BT and SP. The advantage is sectacular with LU, for which unified MPI is always more than 2 times more efficient than MPI+OenMP. For CG and FT, the advantage of one model deends on the data set size, on the tye of nodes and also on the number of nodes. For NH nodes and class A, MPI shows better erformance for CG and FT for any number of nodes. On the oosite, the hybrid model is always better for FT with WH2 nodes (classes A and B) and for class B (WH2 and NH nodes). It is always better for CG with WH2 nodes and class B. The advantage of one model can deend on the number of nodes: for CG and the WH2-class A and NH-class B configurations, MPI is better for a small number of nodes and MPI+OenMP for a larger number of nodes. In both case, the threshold is low (resectively for 2 and 8 nodes). The results are quite similar for the classes A and B, but they are slightly more favorable to the hybrid model for the class B. The results are also slightly more favorable to the hybrid model when using WH2 nodes instead of NH nodes. These results show that the relative erformance of each model deends on the alication, the dataset size and the features of the different comonents of the architecture (CPU, Memory system, Interconnection system). 4 Understanding the erformance results To exlain the results of the revious section, we have decomosed the execution time into comutation and communication times. According to Figure 2, we have further decomosed the comutation time into S+P and Pn, where S, P and Pn are resectively the sequential time, the MPI arallel time (that cannot be arallelized with OenMP) and the comutation time that can be arallelized with OenMP. 4. Amdahl s law for the hybrid model For constant size roblems, when using the MPI unified model with N 4-way nodes, 4N CPUs will run concurrently all the code arts excet the sequential and the communication arts. For the hybrid model with N 4-way nodes, only the Pn art of the rogram can be accelerated. Figure 6 resents the Pn execution time/total execution time ratio according to the number of -way nodes to evaluate the art of the execution time that can effectively benefit from the OenMP arallelization on SMP nodes. The results corresond to Class A and Class B benchmarks. Clearly, the time sent on the sequential sections and the MPI arallel sections that cannot be arallelized with OenMP and the communication time limit the fraction of 5

6 WH2 4-way nodes - Class A WH2 4-way nodes - Class B MPI /MPI+OMP cg ft lu mg bt s MPI / MPI+OMP cg ft lu mg bt s NH 4-way nodes - Class A NH 4-way nodes - Class B MPI / MPI + OenMP cg ft lu mg bt s MPI / MPI + OenMP cg lu mg Figure 5. Ratio of the MPI execution time over the MPI+OenMP execution time according to the number of 4-way nodes. The values are less than when MPI is better. The to diagrams corresond to WH2 nodes, with class A (left) and class B (left). The bottom diagrams corresond to NH nodes. For the NH nodes and class B, some measures are missing. the overall code that can be accelerated with an OenMP arallelization. This alication of Amdahl s law is more significant for the Class A than for the Class B, for which the communication imact is less significant. Its imact is more significant with the most owerful WH2 nodes. For class A and WH2 nodes, CG has the worst behavior, the Pn fraction going down to 25%. Pn fraction decreases from % (CG, FT) or 8% (LU, MG) down to 5 or 7% according to the benchmark and the class. These results are relevant considering the original assumtion that we use OenMP to arallelize existing MPI alications. Also, we should state that a better arallelization could be achieved for some benchmarks like LU using a stronger arallelization effort, but this no longer corresonds to the incremental aroach of arallelizing loo nests, which is the context of this aer. 4.2 Communication and Comutation times To exlain why the hybrid model erforms better on some benchmarks and SP configurations, we have decomosed the execution time into comutation and communication times. Summing searately the comutation times and communication times across all CPUs make easier the resentation of the results. With this assumtion, a erfect seedu would give the same value whatever number of CPUs is used. Class A results with WH2 nodes Figure 7 shows the result for CG, FT, LU and MG in class A with WH2 nodes. Comaring the left and right scales for each benchmark indicates the relative contribution of the communication time on the overall execution time. For all benchmarks excet LU, MPI communication 6

7 WH2 nodes NH nodes Pn exec.time/totalexec. time cg-a cg-b ft - A ft - B lu - A lu - B mg-a mg-b Pn exec. time/ total exec. time cg-a cg-b ft-a ft-b lu-a lu-b mg-a mg-b Figure 6. Fraction of the execution time that can be accelerated with OenMP on clusters of -way nodes according to the number of nodes for Class A and B benchmarks. Com-mi Comm-mi Com-mi+oenm Comm-mi+oenm Com-mi Comm-mi Com-mi+oenm Comm-mi+oenm CG_A Com. time (sec) Number of WH2 nodes CG_A Comm. time (sec) FT_A time (sec) Number of WH2 nodes LU-A Com. time (sec) Com-mi Comm-mi Number of WH2 nodes Com-mi+oenm Comm-mi+oenm LU_A Comm. time (sec) MG_A Com. time (sec) Com-mi Comm-mi Number of WH2 nodes Com-mi+oenm Comm-mi+oenm MG_A Comm. time (sec) Figure 7. Time Breakdown for MG, CG, LU and FT MPI and MPI+OenMP version according to the number of 4-way WH2 nodes. The figures breakdown the total execution time, summed across all the rocessors, into comutation times (shown as bars with the left scale) and communication times (lines with the right scale). 7

8 times are larger than for MPI+OenMP. The difference increases significantly from 4 nodes. To exlain this difference we should state that the two models use different communication atterns for the same number of CPUs. With N 4-way nodes, there are communications between 4N rocesses for the MPI model oosed to communications between N rocesses for the MPI+OenMP model. For constant size roblems, Won et al. [2] have shown for the NAS benchmarks that message size generally decreases when the number of nodes increases, but the number of messages sent and received by each node is likely to increase as more nodes lead to more comlicated communication atterns. These communication atterns and erformance (latency and bandwidth) exlain the behavior of CG. Figure 8 resents the main communication atterns for CG when using 2, 4, 8 and 6 rocessors. The small white boxes are for the rocessors. The grey rectangles corresond to the maing of communications on 4-way nodes. The external communications are between the grey boxes. In each art of the figure, communications are decomosed into small and large messages, with the number of each tye (number * 64-bit words). For CG, the long messages dominates the communication time. When increasing the number of rocesses, CG exchanges more long messages but the messages are shorter. As shown in table, the bandwidth when using a single MPI rocess er node is similar to the aggregate bandwidth when using 4 MPI rocesses er node. So, only the number of long messages gives advantage to the MPI+OenMP * 46 * * 46 * * 46 * * 46 * 35 Figure 8. Communication atterns for CG using 2, 4, 8 and 6 rocessors. The comarison of the communication time for LU gives the oosite result. The unified MPI aroach rovides a lower communication time than the hybrid aroach. Note that the communication time also encomasses the synchronization of the running rocesses. In the original code, there is no way to searate the communication and the synchronization times. LU uses many more small ( KB) messages with blocking communication calls than large messages with asynchronous communication calls. Only considering the extended echo test (for cluster of multirocessors) for synchronous communications cannot exlain the oosite behavior of LU comared to CG. Since ) the communication latency for the four rocesses inside a SMP node (unified aroach) is two times higher ( for 64 rocesses) than the communication latency ( for 6 rocesses) for a single rocess node (hybrid aroach) and 2) the communication time for the hybrid aroach ( ) stays close to the communication time ( ) for a void message (the latency is the dominant factor) although the hybrid aroach uses larger messages (2 times) than the unified aroach, the communication attern should give the advantage to the hybrid aroach. The oosite result suggests that the synchronization time is higher for the hybrid aroach than for the unified one. For all benchmarks, the MPI comutation times are less than the MPI+OenMP ones. The difference is relatively small for FT and CG. For MG, it becomes significant for a large number of nodes. For LU, the MPI comutation time is more than two times smaller than the MPI+OenMP one. Three factors contribute to the difference in comutation time between the two versions. First one is the Amdahl s law on the OenMP arallel art that was mentioned in the revious section. Second one is the data layout, which may be different when a rocess is slit into 4 threads by an OenMP arallelization or when there are 4 MPI rocesses within a node. This difference imacts on the cache behaviour, secially when MPI rogramming allows to exress multi-dimensional blocking. A third factor is the overhead of thread management in the OenMP version. Next section on memory access atterns and arallel efficiency on the OenMP arallel art will give more information to understand the large difference on LU comutation times between the two versions. Combining comutation and communication values exlain the overall results. The MPI aroach with LU, which has both better comutation and communication erformance, outerforms the hybrid aroach. For MG, the comutation advantage of MPI always comensates the communication disadvantage. For CG, the comutation advantage of MPI comensates the communication disadvantage only for a small number of nodes. For FT, communication disadvantage of the MPI aroach associated with the more comlicated all to all communication atterns leads to better erformance with the hybrid aroach. As reviously mentioned, the global 8

9 communications on SP3 systems are not otimized with SMP nodes. Other configurations For LU and MG, the decomosition of the execution time gives very similar results for the 3 other configurations: WH2-class B, NH-class A and NH- class B. We resent now for each configuration the time decomosition of CG (figure 9). Like FT, CG is a benchmark for which the best rogramming model deends on the configuration, the dataset size and the number of nodes. For class A with WH2 nodes, as reviously exlained, the comutation advantage of MPI comensates the communication disadvantage u to 4 nodes, but not for a larger number of nodes. With NH nodes, the comutation for both models are larger than with WH2 nodes (according to resective clock frequency) and the communication time are also slightly larger: the situation is similar, excet that MPI kees advantage until 8 nodes, but measures with 6 NH nodes would give advantage to the hybrid model. For class B, the clock frequency advantage of WH2 nodes only aears for 4 or 8 nodes, robably because some cache effects become significant. For less nodes, the frequency advantage is counterbalanced by the worse erformance of the memory system (bus versus crossbar). With NH nodes, the ga between the MPI communication time and the MPI+OenMP communication time increases quickly and only the configuration with one node shows an advantage for the MPI model. Remember that the communication time includes the synchronization time. We have no clear exlanation for the significant difference in MPI communication times between the two hardware systems for the same number of nodes. One reason might be that the communication time encomasses a CPU art and a Network Interface Communication art. In that case, slower CPUs would lead to larger communication times. Unfortunately, no hardware monitoring counters are available on the SP systems. They would be needed to recisely examine the behaviour of the memory hierarchy for each hardware configuration and benchmark class. 4.3 Memory access atterns The Amdahl s law is not sufficient to exlain the difference of comutation times between the two models. As for the communication atterns, the memory access atterns are different for the two models. To go further, we examine the arallel efficiency, which is defined as the measured seedu divided by the number of CPUs, on the Pn art of the comutation (art that can be arallelized with OenMP). Note that Pn art corresonds to the same code for both models. However loo nests are executed differently by each model. Figure shows the arallel efficiency on the Pn art for the class A and class B benchmarks with WH2 nodes. FT and MG exhibits similar behavior: the arallel efficiency is aroximately the same for the two aroaches. For FT, it increases with the number of nodes for small dataset sizes, which corresond to a better use of caches or remains constant if the dataset size is too large for the cache hierarchy. For MG, the arallel efficiency increases when the number of nodes increases above a threshold. For CG and LU, the MPI version has better arallel efficiency. For CG, the arallel efficiency increases with the number of nodes for large dataset sizes and reaches a maximum with 4 nodes and then decreases: the dataset size is too small for a full exloitation of the cache hierarchy. This last situation exists for LU for class A and class B. The decomosition of the comutation time for the hybrid version shows that two arameters govern its erformance: ) the time sent on the core of the loo nests and 2) the overhead of thread management. Although, the hybrid version reaches lower comutation time on loo cores for some benchmarks, it has higher comutation time for the whole loo nests due to the overhead of the thread management. At this oint, we should state that there is no reason why the hybrid aroach erforms better on some loo nests excet because loo nest otimizations used in the original MPI version do not match well the Power3 memory hierarchy. Conversely, we should note that the OenMP arallelization has a severe limitation comared to the MPI version: OenMP cannot exress multi-dimensional blocking when using the easiest arallelization aroach or simle loo otimizations. Such kinds of atterns require a comlete rewriting of the loo nest (loo fusion), which means a time consuming rogramming effort. 5 Related works The erformance of the IBM SP with Winter Hawk II nodes is resented in [4]. The author comares this architecture to a AlhaServer SC for several kernels and alications that do not include the NAS benchmarks. The aer does not comare unified and hybrid versions. The erformance issues of the NAS benchmarks have been resented in [2]. Esecially, Wong et al. have examined the architectural requirements and scalability. This work is restricted to unirocessor nodes and comares erformance of a cluster of workstations and the SGI Origin 2 machine. The erformance of Message Passing has been detailed in [5] for a cluster of SUN SMPs. Authors note that the memory hierarchy and esecially the oulation of the memory banks have a significant imact on the erformance. The aer does not address the issue of hybrid rogramming models. A comarison of shared memory and message assing 9

10 Com-mi-w Com-mi-n Comm-mi-w Comm-mi-n Com-mi+oenm-w Com-mi+oenm-n Comm-mi+oenm-w Comm-mi+oenm-n Com-mi-w Com-mi-n Comm-mi-w Comm-mi-n Com-mi+oenm-w Com-mi+oenm-n Comm-mi+oenm-w Comm-mi+oenm-n CG_A times (sec) Number of nodes CG_B times (sec) Figure 9. The execution time breakdown for CG according to the number of 4-way nodes for the MPI and MPI+OenMP versions. The left art corresonds to the Class A and the right art to the class B. w (res. n) stands for WH2 (res. NH) nodes. The figures breakdown the total execution time, summed across all the rocessors, into comutation times (shown as bars) and communication time (lines). The left scale is the same for comutation and communication times. Class A Class B Parallel Efficiency on Pn art cg cg ft ft lu lu mg mg Parallel efficiency on Pn art cg cg ft ft lu lu mg mg Figure. Parallel efficiency on the Pn art according to the number of nodes for the Class A (left art) and the Class B benchmarks (right art). tag corresonds to the unified MPI version with 4-way WH2 nodes. tag corresonds to the hybrid version with 4-way WH2 nodes. memory models has been reviously resented in [6]. The aer focuses on classical arallel architectures and does not examine the CLUMP issues. Several aers in the literature resent the imlementation of alications using a hybrid model like [7]. None of them rovides a dee understanding of why this model is better or worst than a unified one. Other rogramming models have been designed for the CLUMPs. For examle, the shared virtual memory environments (DSVM) rovide another alternative to unify the memory model, as resented in [8] [9] [][]. Recently, OenMP [2] has been imlemented on a cluster of SMPs on to of the Treadmark DSM system. As far as we know, there is no comarison between these unified aroaches and some hybrid ones. 6 Concluding remarks Based on the analysis of the NAS benchmarks running on two different IBM SP systems, we have demonstrated that the choice between unified or hybrid rogramming models is not trivial from the MPI existing codes. Several arameters must be considered. The first one is the level of shared memory arallelization achievable for

11 the loo nests of the original MPI rogram. From the original MPI rogram, it is difficult to evaluate if the incremental aroach that arallelizes the loo nests with OenMP (including secific otimizations) is sufficient to imrove erformance or if it is needed to reconsider the arallelization of the alication from scratch. The second arameter concerns the communication cost. For the same number of rocessors, this cost deends on how the communication atterns of the alication match the communication architecture of the cluster. Generally, clusters of multirocessors erform better with unified MPI aroach for latency sensitive rograms and worse for bandwidth sensitive rograms. A third arameter is the memory access atterns used by each aroach. Whereas MPI rogramming allows to exress multi-dimensional blocking, it is not natural for OenMP to do so. To erform the same memory atterns, loo nests must be rewritten which may be comlex for long loo core or dee loo nests. Efforts for imroving OenMP may rovide a solution to this roblem. Other rogramming languages like Co-array [3] may also be considered as alternative to OenMP for the hybrid rogramming or even for the unified rogramming. The last arameter is the erformance balance of the main comonents (CPUs, Memory, Network) of the CLUMP. For the same network erformance, fast rocessors may reduce the comutation time comared to slow rocessors u to the oint that communication erformance becomes significant. In that case the communication erformance allows to select the right model. In the other case, MPI seems to be always the best. Note that that our results and analysis only concerns a articular aroach of hybrid rogramming, which consists in loo level arallelization with a reasonable rogramming effort after rofiling to select the loo nests to arallelize. Other hybrid aroaches may lead to different results and analysis conclusions. We believe that selecting the aroriate model is a general roblem that is not secific to the alications analyzed in the aer. First, NAS benchmarks exhibit a large set of communication, memory reference and comutation atterns. Excet some irregular ones, real alications are likely to encomass similar atterns. Second, we have demonstrated that several imortant arameters distinguish the erformance of both aroaches. Although the relative imact of these arameters is alication deendent, they exist in any arallel alication. Future work will include erforming the same analysis for other clusters of multirocessors like Comaq SC, comaring with the other hybrid aroaches and esecially the coarse grain arallelization and deriving a general erformance model that may hel choosing between unified and hybrid models according to the characteristics of the architecture and the alication. 7 Acknowledgments Alain Quere and Eric Boyer (CINES in Montellier) allowed the first set of measurements on the SP3 with NH nodes as a single user with user sace communications. Olivier Hess, from IBM Montellier, was very helful for all technical roblems. John Levesque and its grou in the IBM Advanced Comuting Technology Center allowed the set of measurements on the SP3 with WH2 nodes. References [] F. Caello and O. Richard. Intra-node arallelization of MPI rograms with OenMP Reort TR-CAP-99, URL: htt:// fci/goinfrewww/96.s.gz [2] F. C. Wong, R. P. Martin, R. H. Araci-Dusseau, D. T. Wu, and D. E. Culler. Architectural requirements and scalability of the NAS arallel benchmarks. In Proceedings of Suercomuting 99, 999. [3] F. Caello, O. Richard O and D. Etiemble. Investigating the erformance of two rogramming models for clusters of SMP PCs In Proc. of the 6th IEEE Sym. on High-Performance Comuter Architecture (HPCA-6). January 2. [4] P. H. Worley. Performance Evaluation of the IBM SP and Comaq AlhaServer SC. In Proc. of the International Conference on Suercomuting, ICS, May 2 [5] Steven S. Lumetta, Alan Mainwaring, and David E. Culler. Multi-rotocol active messages on a cluster of SMPs. In SC 97: High Performance Networking and Comuting: Proceedings of the 997 ACM/IEEE SC97 Conference: November 5 2, 997, San Jose, California, USA., 997. [6] H. Shan and J. P. Singh. A Comarison of MPI, SHMEM, and Cache-Coherent Shared Address Sace Programming Models on the SGI Origin2. In Proc. of the International Conference on Suercomuting, ICS, June 999 [7] S. W. Bova, C. P. Breshears, C. Cuicchi, Z. Demirbilek, H. A. Ga. Dual-level Parallel Analysis of Harbor Wave Resonse Using MPI and OenMP In The International Journal of High Performance Comuting Alications, Sring 2. [8] D. J. Scales, K. Gharachorloo, and A. Aggarwal. Fine-grain software distributed shared memory on

12 SMP clusters. In Proc. of the 4th IEEE Sym. on High-Performance Comuter Architecture (HPCA-4), February 998. [9] R. Stets, S. Dwarkadas, N. Hardavellas, G. Hunt, L. Kontothanassis, S. Parthasarathy, and Michael Scott. Cashmere-2L: Software coherent shared memory on a clustered remote-write network. In Proc. of the 6th ACM Sym. on Oerating Systems Princiles (SOSP-6), October 997. [] R. Samanta, A. Bilas, L. Iftode, and J. P. Singh. Homebased SVM rotocols for SMP clusters: Design and erformance. In Proc. of the 4th IEEE Sym. on High-Performance Comuter Architecture (HPCA-4), February 998. [] Andrew Erlichson, Neal Nuckolls, Greg Chesson, and John Hennessy. SoftFLASH: Analyzing the erformance of clustered distributed virtual shared memory. In Proceedings of the Seventh International Conference on Architectural Suort for Programming Languages and Oerating Systems, ages 2 22, 996. [2] Charlie Hu Honghui Lu Alan Cox and Willy Zwaeneoel. OenMP for Networks of SMPs. In Proc. of the Second Merged Sym. IPPS/SPDP 99, 999. [3] R. W. Numrich and J. K. Reid. Co-Array Fortran for arallel rogramming Research reort RAL-TR-998-6, Deartment for Comutation and Information, Rutherford Aleton Laboratory, Oxon, UK, August 998 2

Web Application Scalability: A Model-Based Approach

Web Application Scalability: A Model-Based Approach Coyright 24, Software Engineering Research and Performance Engineering Services. All rights reserved. Web Alication Scalability: A Model-Based Aroach Lloyd G. Williams, Ph.D. Software Engineering Research

More information

Memory management. Chapter 4: Memory Management. Memory hierarchy. In an ideal world. Basic memory management. Fixed partitions: multiple programs

Memory management. Chapter 4: Memory Management. Memory hierarchy. In an ideal world. Basic memory management. Fixed partitions: multiple programs Memory management Chater : Memory Management Part : Mechanisms for Managing Memory asic management Swaing Virtual Page relacement algorithms Modeling age relacement algorithms Design issues for aging systems

More information

Storage Basics Architecting the Storage Supplemental Handout

Storage Basics Architecting the Storage Supplemental Handout Storage Basics Architecting the Storage Sulemental Handout INTRODUCTION With digital data growing at an exonential rate it has become a requirement for the modern business to store data and analyze it

More information

From Simulation to Experiment: A Case Study on Multiprocessor Task Scheduling

From Simulation to Experiment: A Case Study on Multiprocessor Task Scheduling From to Exeriment: A Case Study on Multirocessor Task Scheduling Sascha Hunold CNRS / LIG Laboratory Grenoble, France sascha.hunold@imag.fr Henri Casanova Det. of Information and Comuter Sciences University

More information

A Virtual Machine Dynamic Migration Scheduling Model Based on MBFD Algorithm

A Virtual Machine Dynamic Migration Scheduling Model Based on MBFD Algorithm International Journal of Comuter Theory and Engineering, Vol. 7, No. 4, August 2015 A Virtual Machine Dynamic Migration Scheduling Model Based on MBFD Algorithm Xin Lu and Zhuanzhuan Zhang Abstract This

More information

Monitoring Frequency of Change By Li Qin

Monitoring Frequency of Change By Li Qin Monitoring Frequency of Change By Li Qin Abstract Control charts are widely used in rocess monitoring roblems. This aer gives a brief review of control charts for monitoring a roortion and some initial

More information

Concurrent Program Synthesis Based on Supervisory Control

Concurrent Program Synthesis Based on Supervisory Control 010 American Control Conference Marriott Waterfront, Baltimore, MD, USA June 30-July 0, 010 ThB07.5 Concurrent Program Synthesis Based on Suervisory Control Marian V. Iordache and Panos J. Antsaklis Abstract

More information

Point Location. Preprocess a planar, polygonal subdivision for point location queries. p = (18, 11)

Point Location. Preprocess a planar, polygonal subdivision for point location queries. p = (18, 11) Point Location Prerocess a lanar, olygonal subdivision for oint location ueries. = (18, 11) Inut is a subdivision S of comlexity n, say, number of edges. uild a data structure on S so that for a uery oint

More information

The impact of metadata implementation on webpage visibility in search engine results (Part II) q

The impact of metadata implementation on webpage visibility in search engine results (Part II) q Information Processing and Management 41 (2005) 691 715 www.elsevier.com/locate/inforoman The imact of metadata imlementation on webage visibility in search engine results (Part II) q Jin Zhang *, Alexandra

More information

Rummage Web Server Tuning Evaluation through Benchmark

Rummage Web Server Tuning Evaluation through Benchmark IJCSNS International Journal of Comuter Science and Network Security, VOL.7 No.9, Setember 27 13 Rummage Web Server Tuning Evaluation through Benchmark (Case study: CLICK, and TIME Parameter) Hiyam S.

More information

An Efficient Method for Improving Backfill Job Scheduling Algorithm in Cluster Computing Systems

An Efficient Method for Improving Backfill Job Scheduling Algorithm in Cluster Computing Systems The International ournal of Soft Comuting and Software Engineering [SCSE], Vol., No., Secial Issue: The Proceeding of International Conference on Soft Comuting and Software Engineering 0 [SCSE ], San Francisco

More information

The risk of using the Q heterogeneity estimator for software engineering experiments

The risk of using the Q heterogeneity estimator for software engineering experiments Dieste, O., Fernández, E., García-Martínez, R., Juristo, N. 11. The risk of using the Q heterogeneity estimator for software engineering exeriments. The risk of using the Q heterogeneity estimator for

More information

An inventory control system for spare parts at a refinery: An empirical comparison of different reorder point methods

An inventory control system for spare parts at a refinery: An empirical comparison of different reorder point methods An inventory control system for sare arts at a refinery: An emirical comarison of different reorder oint methods Eric Porras a*, Rommert Dekker b a Instituto Tecnológico y de Estudios Sueriores de Monterrey,

More information

Buffer Capacity Allocation: A method to QoS support on MPLS networks**

Buffer Capacity Allocation: A method to QoS support on MPLS networks** Buffer Caacity Allocation: A method to QoS suort on MPLS networks** M. K. Huerta * J. J. Padilla X. Hesselbach ϒ R. Fabregat O. Ravelo Abstract This aer describes an otimized model to suort QoS by mean

More information

Machine Learning with Operational Costs

Machine Learning with Operational Costs Journal of Machine Learning Research 14 (2013) 1989-2028 Submitted 12/11; Revised 8/12; Published 7/13 Machine Learning with Oerational Costs Theja Tulabandhula Deartment of Electrical Engineering and

More information

Time-Cost Trade-Offs in Resource-Constraint Project Scheduling Problems with Overlapping Modes

Time-Cost Trade-Offs in Resource-Constraint Project Scheduling Problems with Overlapping Modes Time-Cost Trade-Offs in Resource-Constraint Proect Scheduling Problems with Overlaing Modes François Berthaut Robert Pellerin Nathalie Perrier Adnène Hai February 2011 CIRRELT-2011-10 Bureaux de Montréal

More information

Effect Sizes Based on Means

Effect Sizes Based on Means CHAPTER 4 Effect Sizes Based on Means Introduction Raw (unstardized) mean difference D Stardized mean difference, d g Resonse ratios INTRODUCTION When the studies reort means stard deviations, the referred

More information

DAY-AHEAD ELECTRICITY PRICE FORECASTING BASED ON TIME SERIES MODELS: A COMPARISON

DAY-AHEAD ELECTRICITY PRICE FORECASTING BASED ON TIME SERIES MODELS: A COMPARISON DAY-AHEAD ELECTRICITY PRICE FORECASTING BASED ON TIME SERIES MODELS: A COMPARISON Rosario Esínola, Javier Contreras, Francisco J. Nogales and Antonio J. Conejo E.T.S. de Ingenieros Industriales, Universidad

More information

Load Balancing Mechanism in Agent-based Grid

Load Balancing Mechanism in Agent-based Grid Communications on Advanced Comutational Science with Alications 2016 No. 1 (2016) 57-62 Available online at www.isacs.com/cacsa Volume 2016, Issue 1, Year 2016 Article ID cacsa-00042, 6 Pages doi:10.5899/2016/cacsa-00042

More information

Comparing Dissimilarity Measures for Symbolic Data Analysis

Comparing Dissimilarity Measures for Symbolic Data Analysis Comaring Dissimilarity Measures for Symbolic Data Analysis Donato MALERBA, Floriana ESPOSITO, Vincenzo GIOVIALE and Valentina TAMMA Diartimento di Informatica, University of Bari Via Orabona 4 76 Bari,

More information

Synopsys RURAL ELECTRICATION PLANNING SOFTWARE (LAPER) Rainer Fronius Marc Gratton Electricité de France Research and Development FRANCE

Synopsys RURAL ELECTRICATION PLANNING SOFTWARE (LAPER) Rainer Fronius Marc Gratton Electricité de France Research and Development FRANCE RURAL ELECTRICATION PLANNING SOFTWARE (LAPER) Rainer Fronius Marc Gratton Electricité de France Research and Develoment FRANCE Synosys There is no doubt left about the benefit of electrication and subsequently

More information

Secure synthesis and activation of protocol translation agents

Secure synthesis and activation of protocol translation agents Home Search Collections Journals About Contact us My IOPscience Secure synthesis and activation of rotocol translation agents This content has been downloaded from IOPscience. Please scroll down to see

More information

Service Network Design with Asset Management: Formulations and Comparative Analyzes

Service Network Design with Asset Management: Formulations and Comparative Analyzes Service Network Design with Asset Management: Formulations and Comarative Analyzes Jardar Andersen Teodor Gabriel Crainic Marielle Christiansen October 2007 CIRRELT-2007-40 Service Network Design with

More information

Branch-and-Price for Service Network Design with Asset Management Constraints

Branch-and-Price for Service Network Design with Asset Management Constraints Branch-and-Price for Servicee Network Design with Asset Management Constraints Jardar Andersen Roar Grønhaug Mariellee Christiansen Teodor Gabriel Crainic December 2007 CIRRELT-2007-55 Branch-and-Price

More information

CABRS CELLULAR AUTOMATON BASED MRI BRAIN SEGMENTATION

CABRS CELLULAR AUTOMATON BASED MRI BRAIN SEGMENTATION XI Conference "Medical Informatics & Technologies" - 2006 Rafał Henryk KARTASZYŃSKI *, Paweł MIKOŁAJCZAK ** MRI brain segmentation, CT tissue segmentation, Cellular Automaton, image rocessing, medical

More information

Evaluating a Web-Based Information System for Managing Master of Science Summer Projects

Evaluating a Web-Based Information System for Managing Master of Science Summer Projects Evaluating a Web-Based Information System for Managing Master of Science Summer Projects Till Rebenich University of Southamton tr08r@ecs.soton.ac.uk Andrew M. Gravell University of Southamton amg@ecs.soton.ac.uk

More information

A Novel Architecture Style: Diffused Cloud for Virtual Computing Lab

A Novel Architecture Style: Diffused Cloud for Virtual Computing Lab A Novel Architecture Style: Diffused Cloud for Virtual Comuting Lab Deven N. Shah Professor Terna College of Engg. & Technology Nerul, Mumbai Suhada Bhingarar Assistant Professor MIT College of Engg. Paud

More information

Static and Dynamic Properties of Small-world Connection Topologies Based on Transit-stub Networks

Static and Dynamic Properties of Small-world Connection Topologies Based on Transit-stub Networks Static and Dynamic Proerties of Small-world Connection Toologies Based on Transit-stub Networks Carlos Aguirre Fernando Corbacho Ramón Huerta Comuter Engineering Deartment, Universidad Autónoma de Madrid,

More information

Multiperiod Portfolio Optimization with General Transaction Costs

Multiperiod Portfolio Optimization with General Transaction Costs Multieriod Portfolio Otimization with General Transaction Costs Victor DeMiguel Deartment of Management Science and Oerations, London Business School, London NW1 4SA, UK, avmiguel@london.edu Xiaoling Mei

More information

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi ICPP 6 th International Workshop on Parallel Programming Models and Systems Software for High-End Computing October 1, 2013 Lyon, France

More information

TOWARDS REAL-TIME METADATA FOR SENSOR-BASED NETWORKS AND GEOGRAPHIC DATABASES

TOWARDS REAL-TIME METADATA FOR SENSOR-BASED NETWORKS AND GEOGRAPHIC DATABASES TOWARDS REAL-TIME METADATA FOR SENSOR-BASED NETWORKS AND GEOGRAPHIC DATABASES C. Gutiérrez, S. Servigne, R. Laurini LIRIS, INSA Lyon, Bât. Blaise Pascal, 20 av. Albert Einstein 69621 Villeurbanne, France

More information

Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster

Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster Gabriele Jost and Haoqiang Jin NAS Division, NASA Ames Research Center, Moffett Field, CA 94035-1000 {gjost,hjin}@nas.nasa.gov

More information

Failure Behavior Analysis for Reliable Distributed Embedded Systems

Failure Behavior Analysis for Reliable Distributed Embedded Systems Failure Behavior Analysis for Reliable Distributed Embedded Systems Mario Tra, Bernd Schürmann, Torsten Tetteroo {tra schuerma tetteroo}@informatik.uni-kl.de Deartment of Comuter Science, University of

More information

Moving Objects Tracking in Video by Graph Cuts and Parameter Motion Model

Moving Objects Tracking in Video by Graph Cuts and Parameter Motion Model International Journal of Comuter Alications (0975 8887) Moving Objects Tracking in Video by Grah Cuts and Parameter Motion Model Khalid Housni, Driss Mammass IRF SIC laboratory, Faculty of sciences Agadir

More information

Multi-Channel Opportunistic Routing in Multi-Hop Wireless Networks

Multi-Channel Opportunistic Routing in Multi-Hop Wireless Networks Multi-Channel Oortunistic Routing in Multi-Ho Wireless Networks ANATOLIJ ZUBOW, MATHIAS KURTH and JENS-PETER REDLICH Humboldt University Berlin Unter den Linden 6, D-99 Berlin, Germany (zubow kurth jr)@informatik.hu-berlin.de

More information

A MOST PROBABLE POINT-BASED METHOD FOR RELIABILITY ANALYSIS, SENSITIVITY ANALYSIS AND DESIGN OPTIMIZATION

A MOST PROBABLE POINT-BASED METHOD FOR RELIABILITY ANALYSIS, SENSITIVITY ANALYSIS AND DESIGN OPTIMIZATION 9 th ASCE Secialty Conference on Probabilistic Mechanics and Structural Reliability PMC2004 Abstract A MOST PROBABLE POINT-BASED METHOD FOR RELIABILITY ANALYSIS, SENSITIVITY ANALYSIS AND DESIGN OPTIMIZATION

More information

EECS 122: Introduction to Communication Networks Homework 3 Solutions

EECS 122: Introduction to Communication Networks Homework 3 Solutions EECS 22: Introduction to Communication Networks Homework 3 Solutions Solution. a) We find out that one-layer subnetting does not work: indeed, 3 deartments need 5000 host addresses, so we need 3 bits (2

More information

Project Management and. Scheduling CHAPTER CONTENTS

Project Management and. Scheduling CHAPTER CONTENTS 6 Proect Management and Scheduling HAPTER ONTENTS 6.1 Introduction 6.2 Planning the Proect 6.3 Executing the Proect 6.7.1 Monitor 6.7.2 ontrol 6.7.3 losing 6.4 Proect Scheduling 6.5 ritical Path Method

More information

STATISTICAL CHARACTERIZATION OF THE RAILROAD SATELLITE CHANNEL AT KU-BAND

STATISTICAL CHARACTERIZATION OF THE RAILROAD SATELLITE CHANNEL AT KU-BAND STATISTICAL CHARACTERIZATION OF THE RAILROAD SATELLITE CHANNEL AT KU-BAND Giorgio Sciascia *, Sandro Scalise *, Harald Ernst * and Rodolfo Mura + * DLR (German Aerosace Centre) Institute for Communications

More information

Compensating Fund Managers for Risk-Adjusted Performance

Compensating Fund Managers for Risk-Adjusted Performance Comensating Fund Managers for Risk-Adjusted Performance Thomas S. Coleman Æquilibrium Investments, Ltd. Laurence B. Siegel The Ford Foundation Journal of Alternative Investments Winter 1999 In contrast

More information

Finding a Needle in a Haystack: Pinpointing Significant BGP Routing Changes in an IP Network

Finding a Needle in a Haystack: Pinpointing Significant BGP Routing Changes in an IP Network Finding a Needle in a Haystack: Pinointing Significant BGP Routing Changes in an IP Network Jian Wu, Zhuoqing Morley Mao University of Michigan Jennifer Rexford Princeton University Jia Wang AT&T Labs

More information

Managing specific risk in property portfolios

Managing specific risk in property portfolios Managing secific risk in roerty ortfolios Andrew Baum, PhD University of Reading, UK Peter Struemell OPC, London, UK Contact author: Andrew Baum Deartment of Real Estate and Planning University of Reading

More information

Analysis of Effectiveness of Web based E- Learning Through Information Technology

Analysis of Effectiveness of Web based E- Learning Through Information Technology International Journal of Soft Comuting and Engineering (IJSCE) Analysis of Effectiveness of Web based E- Learning Through Information Technology Anand Tamrakar, Kamal K. Mehta Abstract-Advancements of

More information

X How to Schedule a Cascade in an Arbitrary Graph

X How to Schedule a Cascade in an Arbitrary Graph X How to Schedule a Cascade in an Arbitrary Grah Flavio Chierichetti, Cornell University Jon Kleinberg, Cornell University Alessandro Panconesi, Saienza University When individuals in a social network

More information

Large-Scale IP Traceback in High-Speed Internet: Practical Techniques and Theoretical Foundation

Large-Scale IP Traceback in High-Speed Internet: Practical Techniques and Theoretical Foundation Large-Scale IP Traceback in High-Seed Internet: Practical Techniques and Theoretical Foundation Jun Li Minho Sung Jun (Jim) Xu College of Comuting Georgia Institute of Technology {junli,mhsung,jx}@cc.gatech.edu

More information

FIArch Workshop. Towards Future Internet Architecture. Brussels 22 nd February 2012

FIArch Workshop. Towards Future Internet Architecture. Brussels 22 nd February 2012 FIrch Worksho Brussels 22 nd February 2012 Towards Future Internet rchitecture lex Galis University College London a.galis@ee.ucl.ac.uk www.ee.ucl.ac.uk/~agalis FIrch Worksho Brussels 22 nd February 2012

More information

Two-resource stochastic capacity planning employing a Bayesian methodology

Two-resource stochastic capacity planning employing a Bayesian methodology Journal of the Oerational Research Society (23) 54, 1198 128 r 23 Oerational Research Society Ltd. All rights reserved. 16-5682/3 $25. www.algrave-journals.com/jors Two-resource stochastic caacity lanning

More information

The Advantage of Timely Intervention

The Advantage of Timely Intervention Journal of Exerimental Psychology: Learning, Memory, and Cognition 2004, Vol. 30, No. 4, 856 876 Coyright 2004 by the American Psychological Association 0278-7393/04/$12.00 DOI: 10.1037/0278-7393.30.4.856

More information

Migration to Object Oriented Platforms: A State Transformation Approach

Migration to Object Oriented Platforms: A State Transformation Approach Migration to Object Oriented Platforms: A State Transformation Aroach Ying Zou, Kostas Kontogiannis Det. of Electrical & Comuter Engineering University of Waterloo Waterloo, ON, N2L 3G1, Canada {yzou,

More information

Risk and Return. Sample chapter. e r t u i o p a s d f CHAPTER CONTENTS LEARNING OBJECTIVES. Chapter 7

Risk and Return. Sample chapter. e r t u i o p a s d f CHAPTER CONTENTS LEARNING OBJECTIVES. Chapter 7 Chater 7 Risk and Return LEARNING OBJECTIVES After studying this chater you should be able to: e r t u i o a s d f understand how return and risk are defined and measured understand the concet of risk

More information

An important observation in supply chain management, known as the bullwhip effect,

An important observation in supply chain management, known as the bullwhip effect, Quantifying the Bullwhi Effect in a Simle Suly Chain: The Imact of Forecasting, Lead Times, and Information Frank Chen Zvi Drezner Jennifer K. Ryan David Simchi-Levi Decision Sciences Deartment, National

More information

The fast Fourier transform method for the valuation of European style options in-the-money (ITM), at-the-money (ATM) and out-of-the-money (OTM)

The fast Fourier transform method for the valuation of European style options in-the-money (ITM), at-the-money (ATM) and out-of-the-money (OTM) Comutational and Alied Mathematics Journal 15; 1(1: 1-6 Published online January, 15 (htt://www.aascit.org/ournal/cam he fast Fourier transform method for the valuation of Euroean style otions in-the-money

More information

Partial-Order Planning Algorithms todomainfeatures. Information Sciences Institute University ofwaterloo

Partial-Order Planning Algorithms todomainfeatures. Information Sciences Institute University ofwaterloo Relating the Performance of Partial-Order Planning Algorithms todomainfeatures Craig A. Knoblock Qiang Yang Information Sciences Institute University ofwaterloo University of Southern California Comuter

More information

An Associative Memory Readout in ESN for Neural Action Potential Detection

An Associative Memory Readout in ESN for Neural Action Potential Detection g An Associative Memory Readout in ESN for Neural Action Potential Detection Nicolas J. Dedual, Mustafa C. Ozturk, Justin C. Sanchez and José C. Princie Abstract This aer describes how Echo State Networks

More information

A Simple Model of Pricing, Markups and Market. Power Under Demand Fluctuations

A Simple Model of Pricing, Markups and Market. Power Under Demand Fluctuations A Simle Model of Pricing, Markus and Market Power Under Demand Fluctuations Stanley S. Reynolds Deartment of Economics; University of Arizona; Tucson, AZ 85721 Bart J. Wilson Economic Science Laboratory;

More information

Optimal Routing and Scheduling in Transportation: Using Genetic Algorithm to Solve Difficult Optimization Problems

Optimal Routing and Scheduling in Transportation: Using Genetic Algorithm to Solve Difficult Optimization Problems By Partha Chakroborty Brics "The roblem of designing a good or efficient route set (or route network) for a transit system is a difficult otimization roblem which does not lend itself readily to mathematical

More information

ENFORCING SAFETY PROPERTIES IN WEB APPLICATIONS USING PETRI NETS

ENFORCING SAFETY PROPERTIES IN WEB APPLICATIONS USING PETRI NETS ENFORCING SAFETY PROPERTIES IN WEB APPLICATIONS USING PETRI NETS Liviu Grigore Comuter Science Deartment University of Illinois at Chicago Chicago, IL, 60607 lgrigore@cs.uic.edu Ugo Buy Comuter Science

More information

On the predictive content of the PPI on CPI inflation: the case of Mexico

On the predictive content of the PPI on CPI inflation: the case of Mexico On the redictive content of the PPI on inflation: the case of Mexico José Sidaoui, Carlos Caistrán, Daniel Chiquiar and Manuel Ramos-Francia 1 1. Introduction It would be natural to exect that shocks to

More information

Risk in Revenue Management and Dynamic Pricing

Risk in Revenue Management and Dynamic Pricing OPERATIONS RESEARCH Vol. 56, No. 2, March Aril 2008,. 326 343 issn 0030-364X eissn 1526-5463 08 5602 0326 informs doi 10.1287/ore.1070.0438 2008 INFORMS Risk in Revenue Management and Dynamic Pricing Yuri

More information

The Online Freeze-tag Problem

The Online Freeze-tag Problem The Online Freeze-tag Problem Mikael Hammar, Bengt J. Nilsson, and Mia Persson Atus Technologies AB, IDEON, SE-3 70 Lund, Sweden mikael.hammar@atus.com School of Technology and Society, Malmö University,

More information

Drinking water systems are vulnerable to

Drinking water systems are vulnerable to 34 UNIVERSITIES COUNCIL ON WATER RESOURCES ISSUE 129 PAGES 34-4 OCTOBER 24 Use of Systems Analysis to Assess and Minimize Water Security Risks James Uber Regan Murray and Robert Janke U. S. Environmental

More information

A Multivariate Statistical Analysis of Stock Trends. Abstract

A Multivariate Statistical Analysis of Stock Trends. Abstract A Multivariate Statistical Analysis of Stock Trends Aril Kerby Alma College Alma, MI James Lawrence Miami University Oxford, OH Abstract Is there a method to redict the stock market? What factors determine

More information

C-Bus Voltage Calculation

C-Bus Voltage Calculation D E S I G N E R N O T E S C-Bus Voltage Calculation Designer note number: 3-12-1256 Designer: Darren Snodgrass Contact Person: Darren Snodgrass Aroved: Date: Synosis: The guidelines used by installers

More information

Pressure Drop in Air Piping Systems Series of Technical White Papers from Ohio Medical Corporation

Pressure Drop in Air Piping Systems Series of Technical White Papers from Ohio Medical Corporation Pressure Dro in Air Piing Systems Series of Technical White Paers from Ohio Medical Cororation Ohio Medical Cororation Lakeside Drive Gurnee, IL 600 Phone: (800) 448-0770 Fax: (847) 855-604 info@ohiomedical.com

More information

ANALYSING THE OVERHEAD IN MOBILE AD-HOC NETWORK WITH A HIERARCHICAL ROUTING STRUCTURE

ANALYSING THE OVERHEAD IN MOBILE AD-HOC NETWORK WITH A HIERARCHICAL ROUTING STRUCTURE AALYSIG THE OVERHEAD I MOBILE AD-HOC ETWORK WITH A HIERARCHICAL ROUTIG STRUCTURE Johann Lóez, José M. Barceló, Jorge García-Vidal Technical University of Catalonia (UPC), C/Jordi Girona 1-3, 08034 Barcelona,

More information

Learning Human Behavior from Analyzing Activities in Virtual Environments

Learning Human Behavior from Analyzing Activities in Virtual Environments Learning Human Behavior from Analyzing Activities in Virtual Environments C. BAUCKHAGE 1, B. GORMAN 2, C. THURAU 3 & M. HUMPHRYS 2 1) Deutsche Telekom Laboratories, Berlin, Germany 2) Dublin City University,

More information

18-742 Lecture 4. Parallel Programming II. Homework & Reading. Page 1. Projects handout On Friday Form teams, groups of two

18-742 Lecture 4. Parallel Programming II. Homework & Reading. Page 1. Projects handout On Friday Form teams, groups of two age 1 18-742 Lecture 4 arallel rogramming II Spring 2005 rof. Babak Falsafi http://www.ece.cmu.edu/~ece742 write X Memory send X Memory read X Memory Slides developed in part by rofs. Adve, Falsafi, Hill,

More information

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 29, NO. 4, APRIL 2011 757. Load-Balancing Spectrum Decision for Cognitive Radio Networks

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 29, NO. 4, APRIL 2011 757. Load-Balancing Spectrum Decision for Cognitive Radio Networks IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 29, NO. 4, APRIL 20 757 Load-Balancing Sectrum Decision for Cognitive Radio Networks Li-Chun Wang, Fellow, IEEE, Chung-Wei Wang, Student Member, IEEE,

More information

NBER WORKING PAPER SERIES HOW MUCH OF CHINESE EXPORTS IS REALLY MADE IN CHINA? ASSESSING DOMESTIC VALUE-ADDED WHEN PROCESSING TRADE IS PERVASIVE

NBER WORKING PAPER SERIES HOW MUCH OF CHINESE EXPORTS IS REALLY MADE IN CHINA? ASSESSING DOMESTIC VALUE-ADDED WHEN PROCESSING TRADE IS PERVASIVE NBER WORKING PAPER SERIES HOW MUCH OF CHINESE EXPORTS IS REALLY MADE IN CHINA? ASSESSING DOMESTIC VALUE-ADDED WHEN PROCESSING TRADE IS PERVASIVE Robert Kooman Zhi Wang Shang-Jin Wei Working Paer 14109

More information

Softmax Model as Generalization upon Logistic Discrimination Suffers from Overfitting

Softmax Model as Generalization upon Logistic Discrimination Suffers from Overfitting Journal of Data Science 12(2014),563-574 Softmax Model as Generalization uon Logistic Discrimination Suffers from Overfitting F. Mohammadi Basatini 1 and Rahim Chiniardaz 2 1 Deartment of Statistics, Shoushtar

More information

On Growth of Parallelism within Routers and Its Impact on Packet Reordering 1

On Growth of Parallelism within Routers and Its Impact on Packet Reordering 1 On Growth of Parallelism within Routers and Its Imact on Reordering 1 A. A. Bare 1, A. P. Jayasumana, and N. M. Piratla 2 Deartment of Electrical & Comuter Engineering, Colorado State University, Fort

More information

Int. J. Advanced Networking and Applications Volume: 6 Issue: 4 Pages: 2386-2392 (2015) ISSN: 0975-0290

Int. J. Advanced Networking and Applications Volume: 6 Issue: 4 Pages: 2386-2392 (2015) ISSN: 0975-0290 2386 Survey: Biological Insired Comuting in the Network Security V Venkata Ramana Associate Professor, Deartment of CSE, CBIT, Proddatur, Y.S.R (dist), A.P-516360 Email: ramanacsecbit@gmail.com Y.Subba

More information

High Quality Offset Printing An Evolutionary Approach

High Quality Offset Printing An Evolutionary Approach High Quality Offset Printing An Evolutionary Aroach Ralf Joost Institute of Alied Microelectronics and omuter Engineering University of Rostock Rostock, 18051, Germany +49 381 498 7272 ralf.joost@uni-rostock.de

More information

The Changing Wage Return to an Undergraduate Education

The Changing Wage Return to an Undergraduate Education DISCUSSION PAPER SERIES IZA DP No. 1549 The Changing Wage Return to an Undergraduate Education Nigel C. O'Leary Peter J. Sloane March 2005 Forschungsinstitut zur Zukunft der Arbeit Institute for the Study

More information

2D Modeling of the consolidation of soft soils. Introduction

2D Modeling of the consolidation of soft soils. Introduction D Modeling of the consolidation of soft soils Matthias Haase, WISMUT GmbH, Chemnitz, Germany Mario Exner, WISMUT GmbH, Chemnitz, Germany Uwe Reichel, Technical University Chemnitz, Chemnitz, Germany Abstract:

More information

Software Cognitive Complexity Measure Based on Scope of Variables

Software Cognitive Complexity Measure Based on Scope of Variables Software Cognitive Comlexity Measure Based on Scoe of Variables Kwangmyong Rim and Yonghua Choe Faculty of Mathematics, Kim Il Sung University, D.P.R.K mathchoeyh@yahoo.com Abstract In this aer, we define

More information

This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore.

This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore. This document is downloaded from DR-NTU, Nanyang Technological University Library, Singaore. Title Automatic Robot Taing: Auto-Path Planning and Maniulation Author(s) Citation Yuan, Qilong; Lembono, Teguh

More information

Factoring Variations in Natural Images with Deep Gaussian Mixture Models

Factoring Variations in Natural Images with Deep Gaussian Mixture Models Factoring Variations in Natural Images with Dee Gaussian Mixture Models Aäron van den Oord, Benjamin Schrauwen Electronics and Information Systems deartment (ELIS), Ghent University {aaron.vandenoord,

More information

Automatic Search for Correlated Alarms

Automatic Search for Correlated Alarms Automatic Search for Correlated Alarms Klaus-Dieter Tuchs, Peter Tondl, Markus Radimirsch, Klaus Jobmann Institut für Allgemeine Nachrichtentechnik, Universität Hannover Aelstraße 9a, 0167 Hanover, Germany

More information

Forensic Science International

Forensic Science International Forensic Science International 214 (2012) 33 43 Contents lists available at ScienceDirect Forensic Science International jou r nal h o me age: w ww.els evier.co m/lo c ate/fo r sc iin t A robust detection

More information

Fundamental Concepts for Workflow Automation in Practice

Fundamental Concepts for Workflow Automation in Practice Fundamental Concets for Workflow Automation in Practice Stef Joosten and Sjaak Brinkkemer Centre for Telematics and Information Technology, University of Twente, The Netherlands March 12, 1995 This aer

More information

Design of A Knowledge Based Trouble Call System with Colored Petri Net Models

Design of A Knowledge Based Trouble Call System with Colored Petri Net Models 2005 IEEE/PES Transmission and Distribution Conference & Exhibition: Asia and Pacific Dalian, China Design of A Knowledge Based Trouble Call System with Colored Petri Net Models Hui-Jen Chuang, Chia-Hung

More information

17609: Continuous Data Protection Transforms the Game

17609: Continuous Data Protection Transforms the Game 17609: Continuous Data Protection Transforms the Game Wednesday, August 12, 2015: 8:30 AM-9:30 AM Southern Hemishere 5 (Walt Disney World Dolhin) Tony Negro - EMC Rebecca Levesque 21 st Century Software

More information

Pinhole Optics. OBJECTIVES To study the formation of an image without use of a lens.

Pinhole Optics. OBJECTIVES To study the formation of an image without use of a lens. Pinhole Otics Science, at bottom, is really anti-intellectual. It always distrusts ure reason and demands the roduction of the objective fact. H. L. Mencken (1880-1956) OBJECTIVES To study the formation

More information

Service Network Design with Asset Management: Formulations and Comparative Analyzes

Service Network Design with Asset Management: Formulations and Comparative Analyzes Service Network Design with Asset Management: Formulations and Comarative Analyzes Jardar Andersen Teodor Gabriel Crainic Marielle Christiansen October 2007 CIRRELT-2007-40 Service Network Design with

More information

TRANSMISSION Control Protocol (TCP) has been widely. On Parameter Tuning of Data Transfer Protocol GridFTP for Wide-Area Networks

TRANSMISSION Control Protocol (TCP) has been widely. On Parameter Tuning of Data Transfer Protocol GridFTP for Wide-Area Networks On Parameter Tuning of Data Transfer Protocol GridFTP for Wide-Area etworks Takeshi Ito, Hiroyuki Ohsaki, and Makoto Imase Abstract In wide-area Grid comuting, geograhically distributed comutational resources

More information

Minimizing the Communication Cost for Continuous Skyline Maintenance

Minimizing the Communication Cost for Continuous Skyline Maintenance Minimizing the Communication Cost for Continuous Skyline Maintenance Zhenjie Zhang, Reynold Cheng, Dimitris Paadias, Anthony K.H. Tung School of Comuting National University of Singaore {zhenjie,atung}@com.nus.edu.sg

More information

Re-Dispatch Approach for Congestion Relief in Deregulated Power Systems

Re-Dispatch Approach for Congestion Relief in Deregulated Power Systems Re-Disatch Aroach for Congestion Relief in Deregulated ower Systems Ch. Naga Raja Kumari #1, M. Anitha 2 #1, 2 Assistant rofessor, Det. of Electrical Engineering RVR & JC College of Engineering, Guntur-522019,

More information

Interaction Expressions A Powerful Formalism for Describing Inter-Workflow Dependencies

Interaction Expressions A Powerful Formalism for Describing Inter-Workflow Dependencies Interaction Exressions A Powerful Formalism for Describing Inter-Workflow Deendencies Christian Heinlein, Peter Dadam Det. Databases and Information Systems University of Ulm, Germany {heinlein,dadam}@informatik.uni-ulm.de

More information

Local Connectivity Tests to Identify Wormholes in Wireless Networks

Local Connectivity Tests to Identify Wormholes in Wireless Networks Local Connectivity Tests to Identify Wormholes in Wireless Networks Xiaomeng Ban Comuter Science Stony Brook University xban@cs.sunysb.edu Rik Sarkar Comuter Science Freie Universität Berlin sarkar@inf.fu-berlin.de

More information

The Priority R-Tree: A Practically Efficient and Worst-Case Optimal R-Tree

The Priority R-Tree: A Practically Efficient and Worst-Case Optimal R-Tree The Priority R-Tree: A Practically Efficient and Worst-Case Otimal R-Tree Lars Arge Deartment of Comuter Science Duke University, ox 90129 Durham, NC 27708-0129 USA large@cs.duke.edu Mark de erg Deartment

More information

On the (in)effectiveness of Probabilistic Marking for IP Traceback under DDoS Attacks

On the (in)effectiveness of Probabilistic Marking for IP Traceback under DDoS Attacks On the (in)effectiveness of Probabilistic Maring for IP Tracebac under DDoS Attacs Vamsi Paruchuri, Aran Durresi 2, and Ra Jain 3 University of Central Aransas, 2 Louisiana State University, 3 Washington

More information

Multicore Parallel Computing with OpenMP

Multicore Parallel Computing with OpenMP Multicore Parallel Computing with OpenMP Tan Chee Chiang (SVU/Academic Computing, Computer Centre) 1. OpenMP Programming The death of OpenMP was anticipated when cluster systems rapidly replaced large

More information

A Modified Measure of Covert Network Performance

A Modified Measure of Covert Network Performance A Modified Measure of Covert Network Performance LYNNE L DOTY Marist College Deartment of Mathematics Poughkeesie, NY UNITED STATES lynnedoty@maristedu Abstract: In a covert network the need for secrecy

More information

Improving System Scalability of OpenMP Applications Using Large Page Support

Improving System Scalability of OpenMP Applications Using Large Page Support Improving Scalability of OpenMP Applications on Multi-core Systems Using Large Page Support Ranjit Noronha and Dhabaleswar K. Panda Network Based Computing Laboratory (NBCL) The Ohio State University Outline

More information

Sage Document Management. User's Guide Version 13.1

Sage Document Management. User's Guide Version 13.1 Sage Document Management User's Guide Version 13.1 This is a ublication of Sage Software, Inc. Version 13.1 Last udated: June 19, 2013 Coyright 2013. Sage Software, Inc. All rights reserved. Sage, the

More information

Sage HRMS I Planning Guide. The Complete Buyer s Guide for Payroll Software

Sage HRMS I Planning Guide. The Complete Buyer s Guide for Payroll Software I Planning Guide The Comlete Buyer s Guide for Payroll Software Table of Contents Introduction... 1 Recent Payroll Trends... 2 Payroll Automation With Emloyee Self-Service... 2 Analyzing Your Current Payroll

More information

Efficient Training of Kalman Algorithm for MIMO Channel Tracking

Efficient Training of Kalman Algorithm for MIMO Channel Tracking Efficient Training of Kalman Algorithm for MIMO Channel Tracking Emna Eitel and Joachim Seidel Institute of Telecommunications, University of Stuttgart Stuttgart, Germany Abstract In this aer, a Kalman

More information

CRITICAL AVIATION INFRASTRUCTURES VULNERABILITY ASSESSMENT TO TERRORIST THREATS

CRITICAL AVIATION INFRASTRUCTURES VULNERABILITY ASSESSMENT TO TERRORIST THREATS Review of the Air Force Academy No (23) 203 CRITICAL AVIATION INFRASTRUCTURES VULNERABILITY ASSESSMENT TO TERRORIST THREATS Cătălin CIOACĂ Henri Coandă Air Force Academy, Braşov, Romania Abstract: The

More information

Joint Production and Financing Decisions: Modeling and Analysis

Joint Production and Financing Decisions: Modeling and Analysis Joint Production and Financing Decisions: Modeling and Analysis Xiaodong Xu John R. Birge Deartment of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois 60208,

More information