Parallel Computing with Bayesian MCMC and Data Cloning in R with the dclone package
Péter Sólymos
University of Alberta

Abstract

The dclone R package provides parallel computing features for Bayesian MCMC computations and for the data cloning algorithm. Parallelization can be achieved by running independent MCMC chains in parallel or by partitioning the problem into subsets. Size balancing is introduced as a simple workload optimization algorithm, applicable when processing times for the pieces differ significantly. The speed-up gained from parallel computing infrastructure can facilitate the analysis of larger data sets and complex hierarchical models built upon commonly available Bayesian software.

Keywords: Bayesian statistics, data cloning, maximum likelihood inference, generalized linear models, R, parallel computing.

1. Introduction

Large data sets and complex hierarchical models pose a challenge to statistical computation. To this end researchers often turn to parallel computing, especially for embarrassingly parallel problems such as Markov chain Monte Carlo (MCMC) methods. R (R Development Core Team 2010) is an open source software environment for statistical computing and has interfaces to the most commonly used Bayesian software (JAGS, Plummer (2010); WinBUGS, Spiegelhalter et al. (2003); and OpenBUGS, Spiegelhalter et al. (2007)). R can also exploit the capacity of multi-core machines or of idle computers in a network through various contributed packages (Schmidberger et al. 2009). R is thus a suitable platform for parallelizing Bayesian MCMC computations, which is not a built-in feature of commonly used Bayesian software. Data cloning is a global optimization algorithm that exploits the MCMC methods used in the Bayesian statistical framework while providing valid frequentist inferences, such as the maximum likelihood estimates and their standard errors (Lele et al. 2007, 2010).
This approach makes it possible to fit complex hierarchical models (Lele et al. 2007; Ponciano et al. 2009) and helps in studying parameter identifiability issues (Lele et al. 2010; Sólymos 2010), but it comes at a price: processing time increases with the number of clones, because the same data are repeated many times, which increases the graph dimensions in the Bayesian software used. Again, parallelization naturally follows as an answer to the often demanding computational requirements of data cloning. The R package dclone (Sólymos 2010) provides convenience functions for Bayesian computations and data cloning. The package also has functions for parallel computations, building on
the parallel computing infrastructure provided by the snow package (Tierney et al. 2010). In this paper I present best practices for parallel Bayesian computations in R, describe how the wrapper functions in the dclone package work, and demonstrate how computing time can be improved by these functions and how parallelization can be optimized effectively.

2. Bayesian example

We will use the seeds example from the WinBUGS Manual Vol. I (Spiegelhalter et al. 2003), based on Table 3 of Crowder (1978). The experiment concerns the proportion of seeds that germinated on each of 21 plates arranged according to a 2 × 2 factorial layout by seed (x1i) and type of root extract (x2i). The data are the number of germinated seeds (ri) and the total number of seeds (ni) on the ith plate, i = 1, ..., N:

R> source("
R> str(dat <- seeds$data)
List of 5
 $ N : num 21
 $ r : num [1:21]
 $ n : num [1:21]
 $ x1: num [1:21]
 $ x2: num [1:21]

The Bayesian model is a random effects logistic regression, allowing for overdispersion:

    ri ~ Binomial(pi, ni)
    logit(pi) = alpha0 + alpha1 x1i + alpha2 x2i + alpha12 x1i x2i + bi
    bi ~ Normal(0, sigma^2),

where pi is the probability of germination on the ith plate, and alpha0, alpha1, alpha2, alpha12 and the precision parameter tau = 1/sigma^2 are given independent noninformative priors. The corresponding BUGS model is:

R> (model <- seeds$model)
function() {
    alpha0 ~ dnorm(0.0, 1.0e-6);  # intercept
    alpha1 ~ dnorm(0.0, 1.0e-6);  # seed coeff
    alpha2 ~ dnorm(0.0, 1.0e-6);  # extract coeff
    alpha12 ~ dnorm(0.0, 1.0e-6); # interaction coeff
    tau ~ dgamma(1.0e-3, 1.0e-3); # 1/sigma^2
    sigma <- 1.0/sqrt(tau);
    for (i in 1:N) {
        b[i] ~ dnorm(0.0, tau);
        logit(p[i]) <- alpha0 + alpha1*x1[i] + alpha2*x2[i] +
            alpha12*x1[i]*x2[i] + b[i];
        r[i] ~ dbin(p[i], n[i]);
    }
}

Initial values are taken from the WinBUGS manual:
R> str(inits <- seeds$inits)
List of 5
 $ tau    : num 1
 $ alpha0 : num 0
 $ alpha1 : num 0
 $ alpha2 : num 0
 $ alpha12: num 0

We use the following settings for fitting the JAGS model and monitoring the parameters in params (n.adapt + n.update gives the burn-in iterations, n.iter is the number of samples taken from the posterior distribution, thin is the thinning value, and n.chains is the number of chains used):

R> n.adapt <-
R> n.update <-
R> n.iter <-
R> thin <- 10
R> n.chains <- 3
R> params <- c("alpha0", "alpha1", "alpha2", "alpha12", "sigma")

The full Bayesian results can be obtained via JAGS as

R> m <- jags.fit(data = dat, params = params, model = model,
+     inits = inits, n.adapt = n.adapt, n.update = n.update,
+     n.iter = n.iter, thin = thin, n.chains = n.chains)

3. Parallel MCMC chains

Markov chain Monte Carlo (MCMC) methods represent an embarrassingly parallel problem (Rossini et al. 2003; Schmidberger et al. 2009), and efforts are being made to enable effective parallel computing techniques for Bayesian statistics (Wilkinson 2005), but popular software for fitting graphical models using some dialect of the BUGS language (e.g. WinBUGS, OpenBUGS, JAGS) does not yet come with a built-in facility for parallel computations (see the critique in Lunn et al. (2009)). WinBUGS has been criticized because the independence of parallel chains cannot be guaranteed by setting the random number generator (RNG) seed (Wilkinson 2005). Consequently, parallelizing approaches built on WinBUGS (e.g. GridBUGS, Girgis et al. (2007); or bugsparallel, Metrum Institute (2010)) are not best suited for all types of parallel Bayesian computations (they are best suited for batch processing, but not for generating truly independent chains).
One might argue that the return times of RNG algorithms are long enough that a different-seeds approach is safe to use, but this approach might be risky for long chains, as parallel streams might overlap and parts of the sequences might not be independent (see Wilkinson (2005) for discussion). Also, using the proper parallel RNG of the rlecuyer package (Sevcikova and Rossini 2009) in the R session to ensure the independence of initial values or random seeds will not guarantee chain independence outside of R (this is the case, for example, in the bugsparallel project, according to its source code). Random numbers for initialization in JAGS are determined by the type and seed of the RNG, and currently four different types of RNGs are available (base::Wichmann-Hill, base::Marsaglia-Multicarry, base::Super-Duper, base::Mersenne-Twister). The model initialization by default uses different RNG types for up to four chains, which naturally leads to safe parallelization, because these can represent independent streams on each processor.

Now let us fit the model by using MCMC chains computed on parallel workers. We use three workers (one worker for each MCMC chain; the worker type we use here is socket, but type can be anything else accepted by the snow package, such as "PVM", "MPI", and "NWS"):

R> library(snow)
R> cl <- makeCluster(3, type = "SOCK")

Because MCMC computations (except for generating user supplied initial values) are done outside of R, we just have to ensure that model initialization is done properly and that the initial values contain different .RNG.state and .RNG.name values. For up to four chains (JAGS has 4 RNG types) this can be most easily achieved by

R> inits2 <- jags.fit(dat, params, model, inits, n.chains,
+     n.adapt = 0, n.update = 0, n.iter = 0)$state(internal = TRUE)

Note that this initialization might represent some overhead, especially for models with many latent nodes, but it is the most secure way of ensuring the independence of the MCMC chains. We have to load the dclone package on the workers and set a common working directory where the model file can be found (a single model file is used to avoid multiple read/write requests):

R> clusterEvalQ(cl, library(dclone))
R> clusterEvalQ(cl, setwd(getwd()))
R> filename <- write.jags.model(model)

Using a common file is preferred when the workers share the file system (e.g. for the "SOCK" cluster type on a single machine). If computations are made on a computing grid or cluster without a shared file system, each machine needs a model file written onto its hard drive. For this, one can use the write.jags.model function with the argument overwrite = TRUE to ensure that all workers have the same model file name.
In this case, the name of the model file can be provided as a character vector in subsequent steps. The next step is to distribute the data to the workers

R> cldata <- list(data = dat, params = params, model = filename, inits = inits2)
R> clusterExport(cl, "cldata")

and to create a function that will be evaluated on each worker

R> jagsparallel <- function(i, ...) {
+     jags.fit(data = cldata$data, params = cldata$params,
+         model = cldata$model, inits = cldata$inits[[i]],
+         n.chains = 1, updated.model = FALSE, ...)
+ }

Now we can use the parLapply function of the snow package, and then clean up the model file from the hard drive:

R> res <- parLapply(cl, 1:n.chains, jagsparallel,
+     n.adapt = n.adapt, n.update = n.update, n.iter = n.iter, thin = thin)
R> res <- as.mcmc.list(lapply(res, as.mcmc))
R> clean.jags.model(filename)
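The essential point of the initialization above is that each parallel chain gets its own RNG type and seed. The following base-R sketch shows how such chain-specific initial values can be constructed by hand; the helper name and the seed scheme are assumptions for illustration, only the .RNG.name values are JAGS's actual four base generators:

```r
## The four RNG types shipped with JAGS (module "base"):
rng_types <- c("base::Wichmann-Hill", "base::Marsaglia-Multicarry",
               "base::Super-Duper", "base::Mersenne-Twister")

## Hypothetical helper: append a distinct RNG type and seed to the
## user-supplied initial values for each of up to four chains.
make_parallel_inits <- function(inits, n.chains, seeds = seq_len(n.chains)) {
    stopifnot(n.chains <= length(rng_types))
    lapply(seq_len(n.chains), function(i)
        c(inits, list(.RNG.name = rng_types[i], .RNG.seed = seeds[i])))
}

pinits <- make_parallel_inits(list(tau = 1, alpha0 = 0), n.chains = 3)
sapply(pinits, `[[`, ".RNG.name")  # three distinct generators
```

Each element of such a list can then be passed as the inits argument of a single-chain model fit on its own worker.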
The jags.parfit function in the dclone package is a wrapper for this process; here we can use the model function instead of a model file:

R> m2 <- jags.parfit(cl, data = dat, params = params, model = model,
+     inits = inits, n.adapt = n.adapt, n.update = n.update,
+     n.iter = n.iter, thin = thin, n.chains = n.chains)
R> stopCluster(cl)

The jags.parfit function is the parallel counterpart of the jags.fit function. It internally uses the snowWrapper function, because clusterExport exports objects from the global environment and not from its child environments (i.e. not from inside the jags.parfit function). The snowWrapper function takes care of name clashes, and it can also be used to set the parallel random number generation type via the clusterSetupRNG function of snow. The dclone package can use L'Ecuyer's RNG (L'Ecuyer et al. 2002) ("RNGstream", which requires the rlecuyer package, Sevcikova and Rossini (2009)) or "SPRNG" (which requires the rsprng package, Li (2010)). The type of parallel RNG can be set by the dcoptions function; by default it is "none" (no RNG is set up on the workers). Setting the parallel RNG can be important when random numbers are used as initial values supplied as a function, because the function is then evaluated on the workers. To set the RNG type, we use

R> dcoptions(RNG = "RNGstream")
R> dcoptions()$RNG
[1] "RNGstream"

Table 1 shows different ways of providing initial values. Besides setting up RNGs, we might have to export objects to the workers using the clusterExport function of the snow package if inits contains references to objects defined in the global environment.

4. Parallel data cloning

Data cloning uses MCMC methods to calculate the posterior distribution of the model parameters conditional on the data.
If the number of clones (k) is large enough, the mean of the posterior distribution is the maximum likelihood estimate, and k times the posterior variance is the corresponding asymptotic variance of the maximum likelihood estimate (Lele et al. 2007, 2010). But there is no universal number of clones that guarantees convergence, so the number of clones has to be determined empirically, e.g. by using the dctable and dcdiag functions provided by the dclone package (Sólymos 2010). For this, we have to fit the same model with different numbers of clones of the data and check whether the variances of the parameters are going to zero (Lele et al. 2010). Fitting the model with, say, (1, 2, 5, 10, 25) clones means that we have to fit the model five times. Without parallelization, we can use subsequent calls of the jags.fit function with different data sets corresponding to the different numbers of clones, or use the dc.fit wrapper function of the dclone R package. Parallelization can happen in two ways: (1) we parallelize the computation of the MCMC chains (usually we use more than one chain to check the mixing behaviour of the chains); or (2) we split the problem into subsets and run the subsets on different workers of the parallel computing environment. The first type of parallelization can be addressed by multiple calls to the jags.parfit function. The second type of parallelization is implemented in the dc.parfit function, which is the parallelized version of the function dc.fit.
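The mechanics of cloning, and the k-times-posterior-variance property, can be illustrated with a toy conjugate example. The clone_data helper below is a hypothetical sketch of what cloning a data list means, not the dclone() API itself; the data values are made up. For a normal mean with known variance and a flat prior, the posterior variance based on k clones is sigma^2/(k n), so k times the posterior variance stays at sigma^2/n, the asymptotic variance of the MLE, for every k:

```r
## Hypothetical sketch of data cloning (not the dclone() API itself):
## repeat each observation vector k times and multiply the sample size.
clone_data <- function(dat, k, multiply = "N") {
    out <- lapply(names(dat), function(nm)
        if (nm == multiply) dat[[nm]] * k else rep(dat[[nm]], k))
    names(out) <- names(dat)
    out
}

## Toy check: normal mean, known variance sigma2, flat prior. The
## posterior variance from the cloned data is sigma2 / (k * n), so
## k times the posterior variance is constant in k (= sigma2 / n).
sigma2 <- 4
y <- c(1.2, 0.7, 2.1, 1.5)  # assumed toy data
dat_toy <- list(N = length(y), y = y)
for (k in c(1, 2, 5, 10, 25)) {
    cloned <- clone_data(dat_toy, k)
    post_var <- sigma2 / cloned$N  # flat-prior posterior variance
    stopifnot(isTRUE(all.equal(k * post_var, sigma2 / length(y))))
}
```

In real hierarchical models this invariance only holds asymptotically, which is exactly why dctable and dcdiag are needed to check it empirically.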
inits                    Effect                             Notes for parallel execution
-----------------------  ---------------------------------  ----------------------------------
NULL                     Initial values automatically       None
                         generated with RNG type and seed
Self contained named     inits used, RNG type and seed      None
list                     generated if missing
Named list with          ditto                              Objects must be exported by
references to global                                        clusterExport()
environment
Self contained function  Function used to generate inits,   None
                         RNG type and seed generated if
                         missing
Function using random    ditto                              Parallel RNG should be set by
numbers                                                     dcoptions() or clusterSetupRNG()
Function with            ditto                              Objects must be exported by
references to global                                        clusterExport()
environment

Table 1: Different ways of providing initial values and notes for parallel execution. Self contained means that the object has no reference to other objects or environments, and no random numbers are generated within a self contained function.

Data cloning with parallel MCMC chains

The following simple function allows us to switch parallelization on or off (by setting the parallel argument to TRUE or FALSE, respectively), and also to measure the processing time in minutes:

R> timerfitfun <- function(parallel = FALSE, ...) {
+     t0 <- proc.time()
+     mod <- if (parallel)
+         jags.parfit(...) else jags.fit(...)
+     attr(mod, "timer") <- (proc.time() - t0)[3] / 60
+     mod
+ }

The vector n.clones contains the numbers of clones to be used:

R> n.clones <- c(1, 2, 5, 10, 25)

First we fit the five models sequentially (one model for each element of n.clones) without parallelization of the three MCMC chains:

R> res1 <- lapply(n.clones, function(z)
+     timerfitfun(parallel = FALSE,
+         data = dclone(dat, n.clones = z, multiply = "N"),
+         params = params,
+         model = model,
+         inits = inits,
+         n.adapt = n.adapt,
+         n.update = n.update,
+         n.iter = n.iter,
+         thin = thin,
+         n.chains = n.chains))

Now we fit the models by using MCMC chains computed on parallel workers. We use three workers, one worker for each MCMC chain:

R> cl <- makeCluster(3, type = "SOCK")
R> res2 <- lapply(n.clones, function(z)
+     timerfitfun(parallel = TRUE,
+         cl = cl,
+         data = dclone(dat, n.clones = z, multiply = "N"),
+         params = params,
+         model = model,
+         inits = inits,
+         n.adapt = n.adapt,
+         n.update = n.update,
+         n.iter = n.iter,
+         thin = thin,
+         n.chains = n.chains))
R> stopCluster(cl)

We can extract the timer information from the result objects [1]:

R> pt1 <- sapply(res1, function(z) attr(z, "timer"))
R> pt2 <- sapply(res2, function(z) attr(z, "timer"))

Processing time (column time) increased linearly with n.clones, both with and without parallelization; the rel.change columns indicate the change relative to n.clones = 1, and speedup is the speed-up of the parallel computation relative to its sequential counterpart:

R> tab1 <- data.frame(n.clones = n.clones,
+     time = pt1,
+     rel.change = pt1 / pt1[1],
+     time.par = pt2,
+     rel.change.par = pt2 / pt2[1],
+     speedup = pt2 / pt1)
R> round(tab1, 2)
  n.clones time rel.change time.par rel.change.par speedup

It took a total of 5.77 minutes to fit the five models without parallel MCMC chains, and only 1.99 minutes (34.4 %) with them. Parallel MCMC computations effectively reduced the processing time at a rate of almost 1/(number of chains).

[1] All computations presented here were run on a computer with an Intel Core i7-920 CPU, 3 GB RAM, and the Windows XP operating system.

During iterative model fitting for data cloning using parallel MCMC chains, it is possible to use the joint posterior distribution from the previous iteration as a prior distribution for
the next iteration (possibly defined as a multivariate normal prior; see Sólymos (2010) for the implementation).

Partitioning into parallel subsets

The other way of parallelizing the iterative fitting procedure with data cloning is to split the problem into subsets. The snow package has the clusterSplit function to split a problem into equal sized subsets, and this is used e.g. in the parLapply function. In this approach, each piece is assumed to be of equal size. The function would split the n.clones vector over two workers as

R> clusterSplit(1:2, n.clones)
[[1]]
[1]
[[2]]
[1]

But we have already learnt that running the first subset would take 1.1 minutes while the second would take 4.67 minutes, so the speed-up would be only 5.77 / 4.67 = 1.24. We have also learnt that computing time scales linearly with the number of clones, so we can use this information to optimize the workload. The clusterSplitSB function of the dclone package implements size balancing, which splits the problem into subsets with respect to size (i.e. approximate processing time). The function parLapplySB uses these subsets similarly to parLapply. Our n.clones vector is partitioned over two workers as

R> clusterSplitSB(1:2, n.clones, size = n.clones)
[[1]]
[1] 25
[[2]]
[1]

The expected speed-up based on the results in pt1 is correspondingly higher. In size balancing, the problems are re-ordered from largest to smallest, and the subsets are then determined by minimizing the total approximate processing time. This splitting is deterministic, so the computations are reproducible. However, size balancing can be combined with load balancing (Rossini et al. 2003). This option is implemented in the parLapplySLB function. Load balancing is useful for utilizing inhomogeneous clusters in which nodes differ significantly in performance.
With load balancing, the jobs for the initial list elements are placed on the processors, and the remaining jobs are assigned to the first available processor until all jobs are complete. If the size (processing time) information is correct, size plus load balancing should be identical to size balancing, but the actual sequence of execution might depend on the unequal performance of the workers. The splitting in this case is non-deterministic (it might not be reproducible). If the performance of the workers is known to differ significantly, it is possible to switch to load balancing by setting dcoptions(LB = TRUE), but it is turned off (FALSE) by default. The clusterSize function can be used to determine the optimal number of workers needed for an (approximate) size vector:
R> clusterSize(n.clones)
  workers none load size both

[Figure 1: Effect of balancing type on approximate total processing time, showing the subsets on each worker (panels: No Balancing, Load Balancing, Size Balancing; x-axis: approximate processing time). Note that the actual sequence of execution might be different for load balancing, because it is non-deterministic. For data cloning, the relative performance of the balancing options can be well approximated by using the vector of the numbers of clones to be used.]

The function returns approximate processing times (based on the size vector) as a function of the number of workers and the balancing type. As we can see, only two workers are needed to accomplish the task with size balancing. "none" refers to splitting the problem into subsets without load or size balancing; "both" refers to load plus size balancing, but it is identical to "size" balancing because the improvement due to load balancing cannot be determined a priori. The plotClusterSize function can help to visualize the effect of the different balancing options:

R> opar <- par(mfrow = c(1, 3), cex.lab = 1.5, cex.main = 1.5, cex.sub = 1.5)
R> col <- heat.colors(length(n.clones))
R> xlim <- c(0, max(clusterSize(n.clones)[2, -1]))
R> plotClusterSize(2, n.clones, "none", col = col, xlim = xlim)
R> plotClusterSize(2, n.clones, "load", col = col, xlim = xlim)
R> plotClusterSize(2, n.clones, "size", col = col, xlim = xlim)
R> par(opar)

We can see in Fig. 1 that, for two workers, size balancing results in the shortest processing time. While the model with max(n.clones) is running on one worker, all the other models can be fitted on the other worker. Total processing time is determined by the largest problem size. First, we fit the five models iteratively without parallelization by using the dc.fit function of the dclone package:
R> t0 <- proc.time()
R> res3 <- dc.fit(dat, params, model, inits,
+     n.clones = n.clones, multiply = "N",
+     n.adapt = n.adapt, n.update = n.update,
+     n.iter = n.iter, thin = thin, n.chains = n.chains)
R> attr(res3, "timer") <- (proc.time() - t0)[3] / 60

Now let us fit the same models with parallelization, using the socket cluster type with two workers. The default balancing option in dc.parfit is size balancing. The function collects only descriptive statistics of the MCMC objects when the number of clones is smaller than the maximum, similarly to dc.fit; the whole MCMC object is returned only for the largest number of clones. This approach reduces the communication overhead.

R> cl <- makeCluster(2, type = "SOCK")
R> t0 <- proc.time()
R> res4 <- dc.parfit(cl, dat, params, model, inits,
+     n.clones = n.clones, multiply = "N",
+     n.adapt = n.adapt, n.update = n.update,
+     n.iter = n.iter, thin = thin, n.chains = n.chains)
R> attr(res4, "timer") <- (proc.time() - t0)[3] / 60
R> stopCluster(cl)

Note that with the parallel partitioning approach to data cloning it is not possible to use the posterior distribution as a prior in the next iteration with a higher number of clones. It is only possible to use the same initial values for all runs, as opposed to the sequential dc.fit function (see Sólymos (2010) and help("dc.fit") for a description of the arguments update, updatefun, and initsfun). But unlike the jags.parfit function, which works only with JAGS, the dc.parfit function can work with WinBUGS and OpenBUGS as well (see the documentation of the arguments flavour and program, and the examples on the help pages help("dc.parfit") and help("dc.fit")). It took 5.74 minutes without parallelization, as compared to 3.3 minutes (57.54 %) with parallelization.
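The size-balanced split behind this result can be sketched as a greedy "largest job first" schedule. The helper below is an illustration under the assumption that processing time is proportional to the number of clones; it is not the clusterSplitSB implementation itself:

```r
## Greedy size balancing sketch: sort jobs by decreasing size, then
## always assign the next job to the currently least-loaded worker.
split_sb <- function(sizes, n.workers) {
    groups <- rep(list(numeric(0)), n.workers)
    load <- numeric(n.workers)
    for (s in sort(sizes, decreasing = TRUE)) {
        w <- which.min(load)
        groups[[w]] <- c(groups[[w]], s)
        load[w] <- load[w] + s
    }
    groups
}

n.clones <- c(1, 2, 5, 10, 25)
sb <- split_sb(n.clones, 2)
max(sapply(sb, sum))  # makespan 25: one worker runs the 25-clone model,
                      # the other runs 10 + 5 + 2 + 1 = 18
```

An in-order split such as list(c(1, 2, 5), c(10, 25)) would instead give a makespan of 35, which is why the parallel run time above tracks the largest problem rather than the sum.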
The parallel processing time in this case was close to the time needed to fit the model with max(n.clones) = 25 clones without parallelization, so it was indeed the largest problem that determined the processing time.

5. Conclusions

This paper described the parallel computing features provided by the dclone R package for Bayesian MCMC computations and for the data cloning algorithm. A large amount of parallelism can be achieved in many ways, e.g. by parallelizing single chains (Wilkinson 2005) or via graphics processing units (Suchard and Rambaut 2009; Lee et al. 2010). The parallel chains approach can be highly effective for Markov chains with good mixing and reasonably short burn-in times. The jags.parfit function fits models with parallel chains on separate workers by safely initializing the models, including the RNG setup. Parallel MCMC chain computations are currently implemented to work with JAGS. Size balancing is introduced as a simple workload optimization algorithm, applicable when processing times for the pieces differ significantly. The package contains utility functions that enable size balancing and help optimize the workload. The dc.parfit function uses size
balancing to iteratively fit models on multiple workers for data cloning. dc.parfit can work with JAGS, WinBUGS or OpenBUGS. The functions presented here can facilitate the analysis of larger data sets and complex hierarchical models built upon commonly available Bayesian software.

6. Acknowledgments

I am grateful to Subhash Lele and Khurram Nadeem for valuable discussions. Comments from Christian Robert and the Associate Editor greatly improved the manuscript.

References

Crowder M (1978). "Beta-Binomial ANOVA for Proportions." Applied Statistics, 27.

Girgis IG, Schaible T, Nandy P, De Ridder F, Mathers J, Mohanty S (2007). "Parallel Bayesian Methodology for Population Analysis."

L'Ecuyer P, Simard R, Chen EJ, Kelton WD (2002). "An Object-Oriented Random-Number Package With Many Long Streams and Substreams." Operations Research, 50(6).

Lee A, Yau C, Giles M, Doucet A, Holmes C (2010). "On the Utility of Graphics Cards to Perform Massively Parallel Simulation of Advanced Monte Carlo Methods." Journal of Computational and Graphical Statistics, 19(4).

Lele SR, Dennis B, Lutscher F (2007). "Data Cloning: Easy Maximum Likelihood Estimation for Complex Ecological Models Using Bayesian Markov Chain Monte Carlo Methods." Ecology Letters, 10.

Lele SR, Nadeem K, Schmuland B (2010). "Estimability and Likelihood Inference for Generalized Linear Mixed Models Using Data Cloning." Journal of the American Statistical Association, 105. In press.

Li N (2010). rsprng: R Interface to SPRNG (Scalable Parallel Random Number Generators). R package version 1.0.

Lunn D, Spiegelhalter D, Thomas A, Best N (2009). "The BUGS Project: Evolution, Critique and Future Directions." Statistics in Medicine, 28. With discussion.

Metrum Institute (2010). bugsparallel: R Scripts for Parallel Computation of Multiple MCMC Chains with WinBUGS.

Plummer M (2010). JAGS Version Manual (November 7, 2010). URL http://mcmc-jags.sourceforge.net.

Ponciano JM, Taper ML, Dennis B, Lele SR (2009). "Hierarchical Models in Ecology: Confidence Intervals, Hypothesis Testing, and Model Selection Using Data Cloning." Ecology, 90.

R Development Core Team (2010). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

Rossini A, Tierney L, Li N (2003). "Simple Parallel Statistical Computing in R." UW Biostatistics Working Paper Series, Working Paper 193.

Schmidberger M, Morgan M, Eddelbuettel D, Yu H, Tierney L, Mansmann U (2009). "State of the Art in Parallel Computing with R." Journal of Statistical Software, 31(1).

Sevcikova H, Rossini T (2009). rlecuyer: R Interface to RNG with Multiple Streams. R package version 0.3-1.

Sólymos P (2010). "dclone: Data Cloning in R." The R Journal, 2(2). R package version 1.3-1.

Spiegelhalter D, Thomas A, Best N, Lunn D (2007). OpenBUGS User Manual, Version 3.0.2.

Spiegelhalter DJ, Thomas A, Best NG, Lunn D (2003). WinBUGS Version 1.4 User Manual. MRC Biostatistics Unit, Cambridge.

Suchard MA, Rambaut A (2009). "Many-Core Algorithms for Statistical Phylogenetics." Bioinformatics, 25(11).

Tierney L, Rossini AJ, Li N, Sevcikova H (2010). snow: Simple Network of Workstations. R package version 0.3-3.

Wilkinson DJ (2005). "Parallel Bayesian Computation." In EJ Kontoghiorghes (ed.), Handbook of Parallel Computing and Statistics. Marcel Dekker/CRC Press.

Affiliation:
Péter Sólymos
Alberta Biodiversity Monitoring Institute and Boreal Avian Modelling Project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
E-mail: [email protected]
Handling attrition and non-response in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
Model-based Synthesis. Tony O Hagan
Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that
Programming Using Python
Introduction to Computation and Programming Using Python Revised and Expanded Edition John V. Guttag The MIT Press Cambridge, Massachusetts London, England CONTENTS PREFACE xiii ACKNOWLEDGMENTS xv 1 GETTING
Bayesian Statistics: Indian Buffet Process
Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note
Parallel Ray Tracing using MPI: A Dynamic Load-balancing Approach
Parallel Ray Tracing using MPI: A Dynamic Load-balancing Approach S. M. Ashraful Kadir 1 and Tazrian Khan 2 1 Scientific Computing, Royal Institute of Technology (KTH), Stockholm, Sweden [email protected],
Generalized linear models and software for network meta-analysis
Generalized linear models and software for network meta-analysis Sofia Dias & Gert van Valkenhoef Tufts University, Boston MA, USA, June 2012 Generalized linear model (GLM) framework Pairwise Meta-analysis
Bayesian Statistics in One Hour. Patrick Lam
Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical
Accelerating BIRCH for Clustering Large Scale Streaming Data Using CUDA Dynamic Parallelism
Accelerating BIRCH for Clustering Large Scale Streaming Data Using CUDA Dynamic Parallelism Jianqiang Dong, Fei Wang and Bo Yuan Intelligent Computing Lab, Division of Informatics Graduate School at Shenzhen,
High Performance Matrix Inversion with Several GPUs
High Performance Matrix Inversion on a Multi-core Platform with Several GPUs Pablo Ezzatti 1, Enrique S. Quintana-Ortí 2 and Alfredo Remón 2 1 Centro de Cálculo-Instituto de Computación, Univ. de la República
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing
A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing Liang-Teh Lee, Kang-Yuan Liu, Hui-Yang Huang and Chia-Ying Tseng Department of Computer Science and Engineering,
Imputing Missing Data using SAS
ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are
A Bayesian Antidote Against Strategy Sprawl
A Bayesian Antidote Against Strategy Sprawl Benjamin Scheibehenne ([email protected]) University of Basel, Missionsstrasse 62a 4055 Basel, Switzerland & Jörg Rieskamp ([email protected])
Scaling Bayesian Network Parameter Learning with Expectation Maximization using MapReduce
Scaling Bayesian Network Parameter Learning with Expectation Maximization using MapReduce Erik B. Reed Carnegie Mellon University Silicon Valley Campus NASA Research Park Moffett Field, CA 94035 [email protected]
Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:
CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm
APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies Somesh S Chavadi 1, Dr. Asha T 2 1 PG Student, 2 Professor, Department of Computer Science and Engineering,
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM Albert M. K. Cheng, Shaohong Fang Department of Computer Science University of Houston Houston, TX, 77204, USA http://www.cs.uh.edu
Load balancing in a heterogeneous computer system by self-organizing Kohonen network
Bull. Nov. Comp. Center, Comp. Science, 25 (2006), 69 74 c 2006 NCC Publisher Load balancing in a heterogeneous computer system by self-organizing Kohonen network Mikhail S. Tarkov, Yakov S. Bezrukov Abstract.
Course: Model, Learning, and Inference: Lecture 5
Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 [email protected] Abstract Probability distributions on structured representation.
RevoScaleR Speed and Scalability
EXECUTIVE WHITE PAPER RevoScaleR Speed and Scalability By Lee Edlefsen Ph.D., Chief Scientist, Revolution Analytics Abstract RevoScaleR, the Big Data predictive analytics library included with Revolution
Psychology 209. Longitudinal Data Analysis and Bayesian Extensions Fall 2012
Instructor: Psychology 209 Longitudinal Data Analysis and Bayesian Extensions Fall 2012 Sarah Depaoli ([email protected]) Office Location: SSM 312A Office Phone: (209) 228-4549 (although email will
MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research
MapReduce and Distributed Data Analysis Google Research 1 Dealing With Massive Data 2 2 Dealing With Massive Data Polynomial Memory Sublinear RAM Sketches External Memory Property Testing 3 3 Dealing With
Package snowfall. October 14, 2015
Type Package Title Easier cluster computing (based on snow). Version 1.84-6.1 Date 2013-12-18 Author Jochen Knaus Package snowfall October 14, 2015 Maintainer Jochen Knaus Description
Data Mining. SPSS Clementine 12.0. 1. Clementine Overview. Spring 2010 Instructor: Dr. Masoud Yaghini. Clementine
Data Mining SPSS 12.0 1. Overview Spring 2010 Instructor: Dr. Masoud Yaghini Introduction Types of Models Interface Projects References Outline Introduction Introduction Three of the common data mining
Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng. LISREL for Windows: PRELIS User s Guide
Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng LISREL for Windows: PRELIS User s Guide Table of contents INTRODUCTION... 1 GRAPHICAL USER INTERFACE... 2 The Data menu... 2 The Define Variables
A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment
A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment Panagiotis D. Michailidis and Konstantinos G. Margaritis Parallel and Distributed
PARALLEL & CLUSTER COMPUTING CS 6260 PROFESSOR: ELISE DE DONCKER BY: LINA HUSSEIN
1 PARALLEL & CLUSTER COMPUTING CS 6260 PROFESSOR: ELISE DE DONCKER BY: LINA HUSSEIN Introduction What is cluster computing? Classification of Cluster Computing Technologies: Beowulf cluster Construction
Automated Model Based Testing for an Web Applications
Automated Model Based Testing for an Web Applications Agasarpa Mounica, Lokanadham Naidu Vadlamudi Abstract- As the development of web applications plays a major role in our day-to-day life. Modeling the
Efficient Parallel Execution of Sequence Similarity Analysis Via Dynamic Load Balancing
Efficient Parallel Execution of Sequence Similarity Analysis Via Dynamic Load Balancing James D. Jackson Philip J. Hatcher Department of Computer Science Kingsbury Hall University of New Hampshire Durham,
Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models
Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models Clement A Stone Abstract Interest in estimating item response theory (IRT) models using Bayesian methods has grown tremendously
Detection of changes in variance using binary segmentation and optimal partitioning
Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the
Dynamic resource management for energy saving in the cloud computing environment
Dynamic resource management for energy saving in the cloud computing environment Liang-Teh Lee, Kang-Yuan Liu, and Hui-Yang Huang Department of Computer Science and Engineering, Tatung University, Taiwan
A Comparative Performance Analysis of Load Balancing Algorithms in Distributed System using Qualitative Parameters
A Comparative Performance Analysis of Load Balancing Algorithms in Distributed System using Qualitative Parameters Abhijit A. Rajguru, S.S. Apte Abstract - A distributed system can be viewed as a collection
Facilitating Consistency Check between Specification and Implementation with MapReduce Framework
Facilitating Consistency Check between Specification and Implementation with MapReduce Framework Shigeru KUSAKABE, Yoichi OMORI, and Keijiro ARAKI Grad. School of Information Science and Electrical Engineering,
Scalable Parallel Clustering for Data Mining on Multicomputers
Scalable Parallel Clustering for Data Mining on Multicomputers D. Foti, D. Lipari, C. Pizzuti and D. Talia ISI-CNR c/o DEIS, UNICAL 87036 Rende (CS), Italy {pizzuti,talia}@si.deis.unical.it Abstract. This
Performance Analysis of Web based Applications on Single and Multi Core Servers
Performance Analysis of Web based Applications on Single and Multi Core Servers Gitika Khare, Diptikant Pathy, Alpana Rajan, Alok Jain, Anil Rawat Raja Ramanna Centre for Advanced Technology Department
Classification On The Clouds Using MapReduce
Classification On The Clouds Using MapReduce Simão Martins Instituto Superior Técnico Lisbon, Portugal [email protected] Cláudia Antunes Instituto Superior Técnico Lisbon, Portugal [email protected]
Cluster Computing at HRI
Cluster Computing at HRI J.S.Bagla Harish-Chandra Research Institute, Chhatnag Road, Jhunsi, Allahabad 211019. E-mail: [email protected] 1 Introduction and some local history High performance computing
Building patient-level predictive models Martijn J. Schuemie, Marc A. Suchard and Patrick Ryan 2015-11-01
Building patient-level predictive models Martijn J. Schuemie, Marc A. Suchard and Patrick Ryan 2015-11-01 Contents 1 Introduction 1 1.1 Specifying the cohort of interest and outcomes..........................
TPCalc : a throughput calculator for computer architecture studies
TPCalc : a throughput calculator for computer architecture studies Pierre Michaud Stijn Eyerman Wouter Rogiest IRISA/INRIA Ghent University Ghent University [email protected] [email protected]
BIG DATA What it is and how to use?
BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14
Software Review: ITSM 2000 Professional Version 6.0.
Lee, J. & Strazicich, M.C. (2002). Software Review: ITSM 2000 Professional Version 6.0. International Journal of Forecasting, 18(3): 455-459 (June 2002). Published by Elsevier (ISSN: 0169-2070). http://0-
CUDAMat: a CUDA-based matrix class for Python
Department of Computer Science 6 King s College Rd, Toronto University of Toronto M5S 3G4, Canada http://learning.cs.toronto.edu fax: +1 416 978 1455 November 25, 2009 UTML TR 2009 004 CUDAMat: a CUDA-based
A Simultaneous Solution for General Linear Equations on a Ring or Hierarchical Cluster
Acta Technica Jaurinensis Vol. 3. No. 1. 010 A Simultaneous Solution for General Linear Equations on a Ring or Hierarchical Cluster G. Molnárka, N. Varjasi Széchenyi István University Győr, Hungary, H-906
New Hash Function Construction for Textual and Geometric Data Retrieval
Latest Trends on Computers, Vol., pp.483-489, ISBN 978-96-474-3-4, ISSN 79-45, CSCC conference, Corfu, Greece, New Hash Function Construction for Textual and Geometric Data Retrieval Václav Skala, Jan
Model-Based Cluster Analysis for Web Users Sessions
Model-Based Cluster Analysis for Web Users Sessions George Pallis, Lefteris Angelis, and Athena Vakali Department of Informatics, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece [email protected]
Package biganalytics
Package biganalytics February 19, 2015 Version 1.1.1 Date 2012-09-20 Title A library of utilities for big.matrix objects of package bigmemory. Author John W. Emerson and Michael
Optimal Service Pricing for a Cloud Cache
Optimal Service Pricing for a Cloud Cache K.SRAVANTHI Department of Computer Science & Engineering (M.Tech.) Sindura College of Engineering and Technology Ramagundam,Telangana G.LAKSHMI Asst. Professor,
Imperfect Debugging in Software Reliability
Imperfect Debugging in Software Reliability Tevfik Aktekin and Toros Caglar University of New Hampshire Peter T. Paul College of Business and Economics Department of Decision Sciences and United Health
On Correlating Performance Metrics
On Correlating Performance Metrics Yiping Ding and Chris Thornley BMC Software, Inc. Kenneth Newman BMC Software, Inc. University of Massachusetts, Boston Performance metrics and their measurements are
Package bcrm. September 24, 2015
Type Package Package bcrm September 24, 2015 Title Bayesian Continual Reassessment Method for Phase I Dose-Escalation Trials Version 0.4.5 Date 2015-09-23 Author Michael Sweeting Maintainer Michael Sweeting
Planning the Installation and Installing SQL Server
Chapter 2 Planning the Installation and Installing SQL Server In This Chapter c SQL Server Editions c Planning Phase c Installing SQL Server 22 Microsoft SQL Server 2012: A Beginner s Guide This chapter
Lecture/Recitation Topic SMA 5303 L1 Sampling and statistical distributions
SMA 50: Statistical Learning and Data Mining in Bioinformatics (also listed as 5.077: Statistical Learning and Data Mining ()) Spring Term (Feb May 200) Faculty: Professor Roy Welsch Wed 0 Feb 7:00-8:0
A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data
A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data Faming Liang University of Florida August 9, 2015 Abstract MCMC methods have proven to be a very powerful tool for analyzing
Characteristics of Global Calling in VoIP services: A logistic regression analysis
www.ijcsi.org 17 Characteristics of Global Calling in VoIP services: A logistic regression analysis Thanyalak Mueangkhot 1, Julian Ming-Sung Cheng 2 and Chaknarin Kongcharoen 3 1 Department of Business
Advances in Smart Systems Research : ISSN 2050-8662 : http://nimbusvault.net/publications/koala/assr/ Vol. 3. No. 3 : pp.
Advances in Smart Systems Research : ISSN 2050-8662 : http://nimbusvault.net/publications/koala/assr/ Vol. 3. No. 3 : pp.49-54 : isrp13-005 Optimized Communications on Cloud Computer Processor by Using
HPC Cloud. Focus on your research. Floris Sluiter Project leader SARA
HPC Cloud Focus on your research Floris Sluiter Project leader SARA Why an HPC Cloud? Christophe Blanchet, IDB - Infrastructure Distributing Biology: Big task to port them all to your favorite architecture
Load Distribution on a Linux Cluster using Load Balancing
Load Distribution on a Linux Cluster using Load Balancing Aravind Elango M. Mohammed Safiq Undergraduate Students of Engg. Dept. of Computer Science and Engg. PSG College of Technology India Abstract:
Elemental functions: Writing data-parallel code in C/C++ using Intel Cilk Plus
Elemental functions: Writing data-parallel code in C/C++ using Intel Cilk Plus A simple C/C++ language extension construct for data parallel operations Robert Geva [email protected] Introduction Intel
A Divided Regression Analysis for Big Data
Vol., No. (0), pp. - http://dx.doi.org/0./ijseia.0...0 A Divided Regression Analysis for Big Data Sunghae Jun, Seung-Joo Lee and Jea-Bok Ryu Department of Statistics, Cheongju University, 0-, Korea [email protected],
Your Best Next Business Solution Big Data In R 24/3/2010
Your Best Next Business Solution Big Data In R 24/3/2010 Big Data In R R Works on RAM Causing Scalability issues Maximum length of an object is 2^31-1 Some packages developed to help overcome this problem
PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0
PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0 15 th January 2014 Al Chrosny Director, Software Engineering TreeAge Software, Inc. [email protected] Andrew Munzer Director, Training and Customer
Speed Performance Improvement of Vehicle Blob Tracking System
Speed Performance Improvement of Vehicle Blob Tracking System Sung Chun Lee and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA [email protected], [email protected] Abstract. A speed
Overlapping Data Transfer With Application Execution on Clusters
Overlapping Data Transfer With Application Execution on Clusters Karen L. Reid and Michael Stumm [email protected] [email protected] Department of Computer Science Department of Electrical and Computer
A Comparison of Distributed Systems: ChorusOS and Amoeba
A Comparison of Distributed Systems: ChorusOS and Amoeba Angelo Bertolli Prepared for MSIT 610 on October 27, 2004 University of Maryland University College Adelphi, Maryland United States of America Abstract.
Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com
SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING
Implementing Parameterized Dynamic Load Balancing Algorithm Using CPU and Memory
Implementing Parameterized Dynamic Balancing Algorithm Using CPU and Memory Pradip Wawge 1, Pritish Tijare 2 Master of Engineering, Information Technology, Sipna college of Engineering, Amravati, Maharashtra,
