Fundamentals of Grid Computing: Theory, Algorithms and Technologies. Edited by Frédéric Magoulès. CRC Press.

FUNDAMENTALS OF Grid Computing
Theory, Algorithms and Technologies
Edited by Frédéric Magoulès
CRC Press, Taylor & Francis Group
Boca Raton London New York
CRC Press is an imprint of the Taylor & Francis Group, an Informa business
A CHAPMAN & HALL BOOK

Contents

List of figures
List of tables
Foreword
Preface
Warranty

1 Grid computing overview
  Frédéric Magoulès, Thi-Mai-Huong Nguyen, and Lei Yu
  1.1 Introduction
  1.2 Definitions
  1.3 Classifying grid systems
  1.4 Grid applications
  1.5 Grid architecture
  1.6 Grid computing projects
    1.6.1 Grid middleware (core services)
    1.6.2 Grid resource brokers and schedulers
    1.6.3 Grid systems
    1.6.4 Grid programming environments
    1.6.5 Grid portals
  1.7 Grid evolution
  1.8 Concluding remarks
  1.9 References

2 Synchronization protocols for sharing resources in grid environments
  Julien Sopena, Luciana Arantes, Fabrice Legond-Aubry, and Pierre Sens
  2.1 Introduction
  2.2 Token-based mutual exclusion algorithms
    2.2.1 Martin's algorithm
    2.2.2 Naimi-Trehel's algorithm
    2.2.3 Suzuki-Kasami's algorithm
  2.3 Mutual exclusion algorithms for large configurations
    2.3.1 Priority-based approach

    2.3.2 Composition-based approach
  2.4 Composition approach to mutual exclusion algorithms
    2.4.1 Coordinator processes
  2.5 Composition properties and its natural effects
    2.5.1 Filtering and aggregation
    2.5.2 Preemption and structural effects
    2.5.3 Natural effects of composition
  2.6 Performance evaluation
    2.6.1 Experiment parameters
    2.6.2 Performance results: composition study
    2.6.3 The impact of the grid architecture
  2.7 Concluding remarks
  2.8 References

3 Data replication in grid environments
  Thi-Mai-Huong Nguyen and Frédéric Magoulès
  3.1 Introduction
  3.2 Data replication
    3.2.1 Replication in databases
    3.2.2 Replication in peer-to-peer systems
    3.2.3 Replication in web environments
    3.2.4 Replication in data grids
  3.3 System architecture
  3.4 Selective-rank model for a replication system
    3.4.1 Model assumptions
    3.4.2 Estimating the availability of files
    3.4.3 Problem definition
  3.5 Selective-rank replication algorithm
    3.5.1 Popularity of files
    3.5.2 Correlation of files
    3.5.3 MaxDAR optimizer algorithm
  3.6 Evaluation
    3.6.1 Grid configuration
    3.6.2 Experimental results
  3.7 Concluding remarks
  3.8 References

4 Data management in grids
  Jean-Marc Pierson
  4.1 Introduction
  4.2 From data sources to databases to data sources
  4.3 Positioning the data management in grids within distributed systems
  4.4 Links with the other services of the middleware
  4.5 Problems and some solutions

    4.5.1 Data identification, indexing, metadata
    4.5.2 Data access, interoperability, query processing, transactions
    4.5.3 Transport
    4.5.4 Placement, replication, caching
    4.5.5 Security: transport, authentication, access control, encryption
    4.5.6 Consistency
  4.6 Toward pervasive, autonomic and on-demand data management
  4.7 Concluding remarks
  4.8 References

5 Future of grids resources management
  Fei Teng and Frédéric Magoulès
  5.1 Introduction
  5.2 Several computing paradigms
    5.2.1 Utility computing
    5.2.2 Grid computing
    5.2.3 Autonomic computing
    5.2.4 Cloud computing
  5.3 Definition of cloud computing
    5.3.1 One definition
    5.3.2 Architecture
  5.4 Cloud services
    5.4.1 Three-level services
    5.4.2 Service characters
  5.5 Cloud resource management
    5.5.1 Comparison with grid systems
    5.5.2 Resource model
    5.5.3 Economy-oriented model
  5.6 Future direction of resource scheduling
    5.6.1 Scalable and dynamic
    5.6.2 Secure and trustable
    5.6.3 Virtual machines-based
  5.7 Concluding remarks
  5.8 References

6 Fault-tolerance and availability awareness in computational grids
  Xavier Besseron, Mohamed-Slim Bouguerra, Thierry Gautier, Erik Saule, and Denis Trystram
  6.1 Introduction
  6.2 Background and definitions
    6.2.1 Grid architecture and execution model

    6.2.2 Faults models
    6.2.3 Consistent system states
  6.3 Multi-objective scheduling for safety
    6.3.1 Generalities
    6.3.2 No duplication
    6.3.3 Using duplication
  6.4 Stable memory-based protocols
    6.4.1 Log-based rollback recovery
    6.4.2 Checkpoint-based rollback recovery
  6.5 Stochastic checkpoint model analysis issues
    6.5.1 Completion time without fault tolerance
    6.5.2 Impact of checkpointing on the completion time
  6.6 Implementations
    6.6.1 Single process snapshot
    6.6.2 Fault-tolerance protocol implementations
    6.6.3 Implementation comparison
  6.7 Concluding remarks
  6.8 References

7 Fault tolerance for distributed scheduling in grids
  Lei Yu and Frédéric Magoulès
  7.1 Introduction
  7.2 Fault tolerance in distributed systems
  7.3 Distributed scheduling model
    7.3.1 MMS fault tolerance
    7.3.2 LMS/SMS fault tolerance
    7.3.3 CR fault tolerance
  7.4 Fault detection and repairing in the tree structure
    7.4.1 Notations
    7.4.2 Algorithms description
    7.4.3 Messages treatment analysis
  7.5 Distributed scheduling algorithm
    7.5.1 Distributed dynamic scheduling algorithm with fault tolerance (DDFT)
    7.5.2 Algorithm fault tolerance issues
  7.6 SimGrid and simulation design
  7.7 Evaluation
    7.7.1 Simulation setup
    7.7.2 Comparison with centralized scheduling
    7.7.3 Fault tolerance experiments
    7.7.4 Workload analysis
  7.8 Related work
  7.9 Concluding remarks
  7.10 References

8 Broadcasting for grids
  Christophe Cerin, Luiz-Angelo Steffenel, and Hazem Fkaier
  8.1 Introduction
  8.2 Broadcastings
  8.3 Heuristics for broadcasting
    8.3.1 Basic approaches for broadcasting in homogeneous environments
    8.3.2 Advanced approaches for heterogeneous clusters
    8.3.3 Grid aware heuristics
    8.3.4 New approach for broadcasting in clusters and hyper clusters
  8.4 Related work and related methods
    8.4.1 Broadcasting and dynamic programming
    8.4.2 Multi-criteria approach
    8.4.3 Broadcast for clusters
    8.4.4 Broadcast and heterogeneous systems
  8.5 Concluding remarks
  8.6 References

9 Load balancing algorithms for dynamic networks
  Jacques M. Bahi, Raphael Couturier, and Abderrahmane Sider
  9.1 Introduction
  9.2 A taxonomy for load balancing
  9.3 Distributed load balancing algorithms for static networks
    9.3.1 Network model and performance measures
    9.3.2 Diffusion
    9.3.3 Dimension exchange
    9.3.4 GDE
    9.3.5 Second order algorithms
  9.4 Distributed load balancing algorithms for dynamic networks
    9.4.1 Adaption to dynamic networks
    9.4.2 Generalized adaptive exchange (GAE)
    9.4.3 Illustrating the generalized adaptive exchange most to least loaded policy on a dynamic network
  9.5 Implementation
    9.5.1 On synchronous and asynchronous approaches
    9.5.2 How to define the load for some applications
    9.5.3 Implementation of static algorithms
    9.5.4 Implementation of dynamic algorithms
  9.6 A practical example: the advection diffusion application
    9.6.1 Load balancing and the application
    9.6.2 Load balancing in a dynamic network
  9.7 Concluding remarks
  9.8 References

A Implementation of the replication strategies in OptorSim
  Thi-Mai-Huong Nguyen and Frédéric Magoulès
  A.1 Introduction
  A.2 Download
  A.3 Implementation
    A.3.1 OptorSim implementation
    A.3.2 MaxDAR implementation
  A.4 How to execute the simulation

B Implementation of the simulator for the distributed scheduling model
  Lei Yu and Frédéric Magoulès
  B.1 Introduction
  B.2 Download
  B.3 Implementation
    B.3.1 Data structures
    B.3.2 Functions
  B.4 How to execute the simulation

Glossary
Author Index