Leistungsanalyse von Rechnersystemen
Center for Information Services and High Performance Computing (ZIH)
Leistungsanalyse von Rechnersystemen
29. Oktober 2008
Nöthnitzer Straße 46, Raum 1026, Tel
Holger Brunst (holger.brunst@tu-dresden.de)
Matthias S. Mueller (matthias.mueller@tu-dresden.de)

Summary of Previous Lecture (1)
Remarks: Doherty (1970): Performance is the degree to which a computing system meets the expectations of the persons involved in it.
Main objective: get the highest performance for a given cost
System: an arbitrary collection of hardware, software, and firmware, e.g. a CPU, a database, a network of computers
Metric: a criterion used to evaluate the performance of a system, e.g. response time, throughput, FLOPS
Workload: the overall sum of user requests to a system, e.g. a CPU workload is a collection of instructions to execute
Summary of Previous Lecture (2)
Discussion of performance analysis examples and questions
Selection of technique, metric, and workload
Correctness of performance measurements
Measurement and simulation design
The art of performance analysis:
A successful evaluation cannot be produced mechanically
Evaluation requires detailed knowledge of the system to be modeled

Summary of Previous Lecture (3)
Knowledge of common mistakes and games is important for
choosing the right methodology as an analyst
questioning offers, recommendations, and advertisements as a consumer, buying agent, or decision maker
Classes of common mistakes: goals, methodology, completeness, analysis, presentation
Checklist for avoiding problems
Systematic approach to performance evaluation
Summary of Previous Lecture: Questions
What does performance mean?
What are the main reasons to do a performance analysis? What are the main tasks?
What's a system in performance analysis terminology?
What do the terms metric and workload stand for?
What's a performance parameter? What's a performance factor?

Parallel Metrics
Excursion on Speedup and Efficiency Metrics
Comparison of sequential and parallel algorithms
Speedup: S_n = T_1 / T_n, where n is the number of processors, T_1 is the execution time of the sequential algorithm, and T_n is the execution time of the parallel algorithm with n processors
Efficiency: E_p = S_p / p
Its value estimates how well p processors are utilized in solving a given problem
Usually between zero and one; exception: superlinear speedup (later)

Amdahl's Law
Find the maximum expected improvement to an overall system when only part of the system is improved
Serial execution time = s + p
Parallel execution time = s + p/n
S_n = (s + p) / (s + p/n)
Normalizing with respect to the serial time, s + p = 1, results in: S_n = 1 / (s + p/n)
Drops off rapidly as the serial fraction increases
Maximum speedup possible = 1/s, independent of n, the number of processors!
Bad news: if an application has only 1% serial work (s = 0.01), then you will never see a speedup greater than 100.
So, why do we build systems with more than 100 processors? What is wrong with this argument?
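To make Amdahl's argument concrete, here is a minimal numeric sketch of the speedup formula above (the function name is my own):

```python
# Amdahl's law with serial fraction s and parallel fraction p = 1 - s
# (normalized so that s + p = 1): S_n = 1 / (s + p/n).
def amdahl_speedup(s: float, n: int) -> float:
    """Speedup predicted by Amdahl's law for serial fraction s on n processors."""
    p = 1.0 - s
    return 1.0 / (s + p / n)

# With only 1% serial work the speedup saturates near 1/s = 100,
# no matter how many processors we add.
for n in (10, 100, 1000, 10**6):
    print(n, round(amdahl_speedup(0.01, n), 1))
```

Even a million processors never push the predicted speedup past 100, which is exactly the "bad news" on the slide.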
Scaled Speedup (Gustafson-Barsis Law)
Amdahl's speedup equation assumes p is independent of n, in other words the problem size remains the same
The Gustafson-Barsis law states that any sufficiently large problem can be efficiently parallelized
More realistic to assume the runtime remains the same, NOT the problem size
If the problem size scales up, does the serial part also increase?
Parallel execution time = s + p
Serial execution time = s + pn
S_sn = (s + pn) / (s + p)
Normalizing with respect to the parallel execution time, s + p = 1, results in: S_sn = n + (1 - n)s = p(n - 1) + 1

Efficiency and Serial Fraction
Strong scalability vs. weak scalability
E_n = S_n / n does not tell the whole story: is it necessarily bad if efficiency drops as you increase n for a given problem size?
s is supposed to be a constant; this assumes the work is load balanced and there is no overhead for synchronizing the processors
Experimentally measure the serial fraction: if s does not remain constant, what can we discern?
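The scaled-speedup formula and the experimentally measured serial fraction can both be sketched in a few lines. Inverting Amdahl's law to recover s from an observed speedup is commonly known as the Karp-Flatt metric; the function names below are my own:

```python
# Scaled (Gustafson-Barsis) speedup, normalized to the parallel run
# time (s + p = 1): S_sn = s + p*n = n + (1 - n)*s.
def gustafson_speedup(s: float, n: int) -> float:
    return n + (1 - n) * s

# Experimentally measured serial fraction: invert Amdahl's law
# given an observed speedup S on n processors.
def serial_fraction(S: float, n: int) -> float:
    return (1.0 / S - 1.0 / n) / (1.0 - 1.0 / n)

# With s = 0.01, 100 processors still give ~99x scaled speedup,
# while Amdahl's fixed-size view predicted only ~50x.
print(gustafson_speedup(0.01, 100))   # 99.01
```

If the measured serial fraction grows with n, that points to synchronization overhead or load imbalance rather than inherently serial work.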
Superlinear/Superunitary Speedup
Work in algorithm = W_real + W_ovhd. What is W_ovhd?
Super-unitary speedup is possible if the total work done by n processors is strictly less than that done by a single processor
Reasons for super-unitary speedup:
Memory and cache effects
Dividing up resource management overheads
Hiding latency for remote operations
Randomized algorithms
In the literature, superlinear speedup is sometimes also referred to as superunitary speedup, which might be mathematically more correct

Workload types, selection and characterization
Types of Workloads
Test workload: any workload used in performance studies; real or synthetic
Real workload: observed on a system being used for normal operation
Cannot be repeated
May contain sensitive data
Synthetic workload: should be representative of a real workload; often smaller in size
Historical examples of test workloads:
Addition instruction
Instruction mixes
Kernels
Synthetic programs
Application benchmarks
Popular benchmarks: Eratosthenes sieve algorithm
Algorithm to find prime numbers
Kernel
Simple
An algorithm is always independent of a computer language or specific implementation
Not very representative of today's use of computers

Popular benchmarks: Ackermann's Function
Ackermann(m,n) :=
  n+1 if m=0
  Ackermann(m-1, 1) if n=0
  Ackermann(m-1, Ackermann(m, n-1)) otherwise
Used to assess the efficiency of procedure calls
Ackermann(3,n) requires (512*4**(n-1) - 15*2**(n+3) + 9*n + 37)/3 calls and a stack depth of 2**(n+3) - 4
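Both kernels above are short enough to sketch directly. The Python version below also instruments Ackermann's function with a call counter (my own addition) so the call-count formula on the slide can be checked:

```python
# Sieve of Eratosthenes: the prime-finding kernel described above.
def sieve(limit):
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [i for i, p in enumerate(is_prime) if p]

# Ackermann's function, instrumented with a call counter so the
# call-count formula on the slide can be verified.
calls = 0
def ackermann(m, n):
    global calls
    calls += 1
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))

print(sieve(30))        # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(ackermann(3, 2))  # 29, i.e. 2**(2+3) - 3
print(calls)            # 541 = (512*4**(2-1) - 15*2**(2+3) + 9*2 + 37) // 3
```

The heavy recursion with almost no arithmetic is precisely why Ackermann was used to measure procedure-call overhead.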
Popular benchmarks: Whetstone
Used at the British Central Computer Agency
11 modules
Representative of 949 ALGOL programs
Available in ALGOL, FORTRAN, PL/I and other languages
See Curnow and Wichmann (1975)
Results in KWIPS (Kilo Whetstone Instructions Per Second)
Workload characteristics:
Floating point intensive
Cache friendly
No I/O

Popular benchmarks: LINPACK
Developed by Jack Dongarra (1983) at ANL (now ICL, UTK)
Solves a dense system of linear equations
Algorithmic definition of the benchmark
Reference implementation available (HPL)
Makes heavy use of BLAS
One fixed dataset: 100x100
Used as the benchmark for the TOP500 list
Many vendors have their own hand-tuned implementation
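The operation LINPACK times, solving a dense system Ax = b, can be illustrated with a toy Gaussian elimination. The real benchmark uses tuned, BLAS-based LU factorization; this pure-Python sketch only shows the operation being measured:

```python
# Solve a dense linear system Ax = b by Gaussian elimination with
# partial pivoting (illustrative only; LINPACK/HPL use tuned BLAS).
def solve(A, b):
    n = len(A)
    # augmented working copy [A | b]
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # partial pivoting: bring the largest entry to the diagonal
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # back substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
print(solve(A, b))   # approximately [0.8, 1.4]
```

The elimination is O(n^3) in floating-point operations, which is why a fixed dataset size (100x100) makes the achieved FLOPS directly comparable across machines.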
Popular benchmarks: Dhrystone
Developed in 1984 by Reinhold Weicker at Siemens
Represents systems programming environments
Available in C, Pascal and Ada
Results are in Dhrystone Instructions Per Second (DIPS)
Includes ground rules for building and executing Dhrystone (run rules)

Popular Benchmarks: Lawrence Livermore Loops
24 separate tests
Largely vectorizable
Assembled at LLNL (see McMahon 1986)
Popular Benchmarks: Transaction Processing (TPC-C)
Successor of the Debit-Credit benchmark
TPC-C is an on-line transaction processing benchmark
Results report performance (tpmC) and price/performance ($/tpmC)
The system reported has to be available to the customer (at that price)
Running the benchmark requires a costly setup.

SPEC groups and benchmarks
Open Systems Group (desktop systems, high-end workstations and servers)
CPU (CPU benchmarks)
JAVA (Java client- and server-side benchmarks)
MAIL (mail server benchmarks)
SFS (file server benchmarks)
WEB (web server benchmarks)
High Performance Group (HPC systems)
OMP (OpenMP benchmark)
HPC (HPC application benchmark)
MPI (MPI application benchmark)
Graphics Performance Groups (Graphics)
Apc (graphics application benchmarks)
Opc (OpenGL performance benchmarks)
Workload Selection

System under Study
Seems to be an easy thing to define
Be aware of different abstraction layers
Example: ISO/OSI reference model for computer networks:
7. Application (mail, FTP)
6. Presentation (data compression, ...)
5. Session (dialogs)
4. Transport (messages)
3. Network (packets)
2. Datalink (frames)
1. Physical (bits)
Level of Detail of the Workload Description
Examples:
Most frequent request (e.g. addition)
Frequency of request types (instruction mix)
Time-stamped sequence of requests
Average resource demand (e.g. 20 I/O requests per second)
Distribution of resource demands (not only the average, but also the probability distribution)

Representativeness
After all, benchmarks are not a merit of their own; they should represent real workloads.
Different characteristics to consider:
Arrival rate of requests
Resource demands
Resource usage profile (sequence and amounts of resources used by an application)
To be representative, a test workload has to follow the user behavior in a timely fashion!
SPEC Benchmarks
Vorlesung Leistungsanalyse

Outline
What is SPEC? Who is SPEC?
Some SPEC benchmarks: SPEC CPU, SPEC HPC, SPEC OMP, SPEC MPI
Summary
What and who is SPEC?

What is SPEC?
The Standard Performance Evaluation Corporation (SPEC) is a non-profit corporation formed to establish, maintain and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers. SPEC develops suites of benchmarks and also reviews and publishes submitted results from our member organizations and other benchmark licensees.
For more details see
16 SPEC Members SPEC Members: 3DLabs * Acer Inc. * Advanced Micro Devices * Apple Computer, Inc. * ATI Research * Azul Systems, Inc. * BEA Systems * Borland * Bull S.A. * CommuniGate Systems * Dell * EMC * Exanet * Fabric7 Systems, Inc. * Freescale Semiconductor, Inc. * Fujitsu Limited * Fujitsu Siemens * Hewlett-Packard * Hitachi Data Systems * Hitachi Ltd. * IBM * Intel * ION Computer Systems * JBoss * Microsoft * Mirapoint * NEC - Japan * Network Appliance * Novell * NVIDIA * Openwave Systems * Oracle * P.A. Semi * Panasas * PathScale * The Portland Group * S3 Graphics Co., Ltd. * SAP AG * SGI * Sun Microsystems * Super Micro Computer, Inc. * Sybase * Symantec Corporation * Unisys * Verisign * Zeus Technology * SPEC Associates: California Institute of Technology * Center for Scientific Computing (CSC) * Defence Science and Technology Organisation - Stirling * Dresden University of Technology * Duke University * JAIST * Kyushu University * Leibniz Rechenzentrum - Germany * National University of Singapore * New South Wales Department of Education and Training * Purdue University * Queen's University * Rightmark * Stanford University * Technical University of Darmstadt * Texas A&M University * Tsinghua University * University of Aizu - Japan * University of California - Berkeley * University of Central Florida * University of Illinois - NCSA * University of Maryland * University of Modena * University of Nebraska, Lincoln * University of New Mexico * University of Pavia * University of Stuttgart * University of Texas at Austin * University of Texas at El Paso * University of Tsukuba * University of Waterloo * VA Austin Automation Center * SPEC members in Dresden: Workshop June
SPEC groups
Open Systems Group (desktop systems, high-end workstations and servers)
CPU (CPU benchmarks)
JAVA (Java client- and server-side benchmarks)
MAIL (mail server benchmarks)
SFS (file server benchmarks)
WEB (web server benchmarks)
High Performance Group (HPC systems)
OMP (OpenMP benchmark)
HPC (HPC application benchmark)
MPI (MPI application benchmark)
Graphics Performance Groups (Graphics)
Apc (graphics application benchmarks)
Opc (OpenGL performance benchmarks)

SPEC HPG = SPEC High-Performance Group
Founded in 1994
Mission: to establish, maintain, and endorse a suite of benchmarks that are representative of real-world high-performance computing applications.
SPEC/HPG includes members from both industry and academia.
Benchmark products:
SPEC OMP (OMPM2001, OMPL2001)
SPEC HPC2002, released at SC 2002
SPEC MPI (under development)
Currently active SPEC HPG Members
Fujitsu, HP, IBM, Intel, SGI, SUN, UNISYS, Purdue University, Technische Universität Dresden

HPG (High Performance Group) Benchmark Suites
[Timeline: founding of SPEC HPG, followed by the releases of the benchmark suites HPC96, OMP2001 (June 2001), HPC2002 (June 2002), OMPL2001, and MPI2007]
Overview and Positioning

Where is SPEC Relative to Other Benchmarks?
There are many metrics; each one has its purpose.
From raw computer hardware up to applications:
Raw machine performance: Tflops
Microbenchmarks: Stream
Algorithmic benchmarks: Linpack
Compact apps/kernels: NAS benchmarks
Application suites: SPEC
User-specific applications: custom benchmarks
Why do we need benchmarks?
Identify problems: measure machine properties
Time evolution: verify that we make progress
Coverage: help the vendors to have representative codes:
Increase competition by transparency
Drive future development (see SPEC CPU2000)
Relevance: help the customers to choose the right computer

Comparison of different benchmark classes
[Table comparing the benchmark classes micro, algorithmic, kernels, SPEC, and apps by coverage, relevance, ability to identify problems, and time evolution]
SPEC CPU2006
From John Henning's talk at the SPEC Workshop, June 2007, Dresden

SPEC CPU2006 History
Released August 2006
Replaces CPU2000 (retired February 2007)
5th CPU benchmark suite:
SPECmark (later called CPU89)
SPEC92 (later called CPU92)
CPU95
CPU2000
CPU2006
Note: these updates are required to stay representative
Question to the audience: what kind of application would you add?
CINT2006 (benchmark, language, application area: description)
400.perlbench (C, Programming Language): Derived from Perl V. The workload includes SpamAssassin, MHonArc (an indexer), and specdiff (SPEC's tool that checks benchmark outputs).
401.bzip2 (C, Compression): Julian Seward's bzip2 version 1.0.3, modified to do most work in memory, rather than doing I/O.
403.gcc (C, C Compiler): Based on gcc version 3.2; generates code for Opteron.
429.mcf (C, Combinatorial Optimization): Vehicle scheduling. Uses a network simplex algorithm (which is also used in commercial products) to schedule public transport.
445.gobmk (C, Artificial Intelligence: Go): Plays the game of Go, a simply described but deeply complex game.
456.hmmer (C, Gene Sequence Search): Protein sequence analysis using profile hidden Markov models (profile HMMs).
458.sjeng (C, Artificial Intelligence: chess): A highly ranked chess program that also plays several chess variants.
462.libquantum (C, Physics: Quantum Computing): Simulates a quantum computer running Shor's polynomial-time factorization algorithm.
464.h264ref (C, Video Compression): A reference implementation of H.264/AVC; encodes a video stream using 2 parameter sets. The H.264/AVC standard is expected to replace MPEG2.
471.omnetpp (C++, Discrete Event Simulation): Uses the OMNeT++ discrete event simulator to model a large Ethernet campus network.
473.astar (C++, Path-finding Algorithms): Pathfinding library for 2D maps, including the well-known A* algorithm.
483.xalancbmk (C++, XML Processing): A modified version of Xalan-C++, which transforms XML documents to other document types.

CFP2006 (part I)
410.bwaves (Fortran, Fluid Dynamics): Computes 3D transonic transient laminar viscous flow.
416.gamess (Fortran, Quantum Chemistry): Implements a wide range of quantum chemical computations. The SPEC workload does self-consistent field calculations using the Restricted Hartree-Fock, Restricted Open-Shell Hartree-Fock, and Multi-Configuration Self-Consistent Field methods.
433.milc (C, Physics/QCD): A gauge field generating program for lattice gauge theory with dynamical quarks.
434.zeusmp (Fortran, Physics/CFD): ZEUS-MP is a computational fluid dynamics code developed at the Laboratory for Computational Astrophysics (NCSA, University of Illinois at Urbana-Champaign) for the simulation of astrophysical phenomena.
435.gromacs (C/Fortran, Biochemistry): Molecular dynamics, i.e. simulates the Newtonian equations of motion for hundreds to millions of particles. The test case simulates the protein lysozyme in a solution.
436.cactusADM (C/Fortran, Physics: General Relativity): Solves the Einstein evolution equations using a staggered-leapfrog numerical method.
437.leslie3d (Fortran, Fluid Dynamics): Computational fluid dynamics (CFD) using Large-Eddy Simulations with Linear-Eddy Model in 3D. Uses MacCormack predictor-corrector time integration.
444.namd (C++, Biology/Molecular Dynamics): Simulates biomolecular systems. The test case has 92,224 atoms of apolipoprotein A-I.
447.dealII (C++, Finite Element Analysis): deal.II is a C++ library targeted at adaptive finite elements and error estimation. The test case solves a Helmholtz-type equation with non-constant coefficients.
CFP2006 (part II)
450.soplex (C++, Linear Programming/Optimization): Solves a linear program using a simplex algorithm and sparse linear algebra. Test cases include railroad planning and military airlift models.
453.povray (C++, Image Ray-Tracing): Image rendering. The test case is a 1280x1024 anti-aliased image of a landscape with some abstract objects with textures using a Perlin noise function.
454.calculix (C/Fortran, Structural Mechanics): Finite element code for 3D structural applications. Uses the SPOOLES solver library.
459.GemsFDTD (Fortran, Electromagnetics): Solves the Maxwell equations in 3D using the finite-difference time-domain (FDTD) method.
465.tonto (Fortran, Quantum Chemistry): An open-source quantum chemistry package, using an object-oriented design in Fortran 95. The test case places a constraint on a molecular Hartree-Fock wavefunction calculation to better match experimental X-ray diffraction data.
470.lbm (C, Fluid Dynamics): Implements the "Lattice-Boltzmann Method" to simulate incompressible fluids in 3D.
481.wrf (C/Fortran, Weather): Weather modeling from scales of meters to thousands of kilometers. The test case is from a 30 km area over 2 days.
482.sphinx3 (C, Speech Recognition): A widely known speech recognition system from Carnegie Mellon University.

Code growth
Metrics
Speed:
SPECint_base2006 (required base result)
SPECint2006 (optional peak result)
SPECfp_base2006 (required base result)
SPECfp2006 (optional peak result)
Throughput:
SPECint_rate_base2006 (required base result)
SPECint_rate2006 (optional peak result)
SPECfp_rate_base2006 (required base result)
SPECfp_rate2006 (optional peak result)

Speed Metric for a Single Benchmark
For each benchmark in the suite, compute the ratio vs. the time on a reference system
A 1997 Sun system with a 296 MHz UltraSPARC II
Similar but not identical to the CPU2000 reference machine
Example: 400.perlbench on a year-2006 iMac took 948 seconds
On the reference system, it took 9770 seconds
SPECratio = 10.3 (9770/948)
If your workload looks like perl, you might find that this modern iMac runs around 10x faster than a state-of-the-1997-art workstation.
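The SPECratio computation from the example above, written out (the function name is my own):

```python
# SPECratio for one benchmark: reference-machine time / measured time.
def spec_ratio(ref_seconds: float, measured_seconds: float) -> float:
    return ref_seconds / measured_seconds

# 400.perlbench numbers from the example above.
print(round(spec_ratio(9770, 948), 1))   # 10.3
```

A ratio greater than 1 means the machine under test is faster than the 1997 reference system.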
Overall Speed Metric
To obtain the overall speed metrics: geometric mean of the individual SPECratios
Why the geometric mean? Because this is the best answer to the question: "Without knowing how much time I will spend in text processing vs. network mapping vs. compiling vs. video compression, please tell me about how much faster this machine will be than the reference system."

Motivation for the Throughput Metric
Differs from speed
Stove analogy:
One big flame cooks one big pot holding one hogshead in one hour
6 little flames cook 6 little pots, each holding one firkin, in 15 minutes
Which is better? Well, the big flame does ~250 liters/hour; each little flame does only ~40 liters per 15 minutes, i.e. 4 * 40 = 160 liters/hour
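The overall metric is then the geometric mean of the per-benchmark SPECratios; a minimal sketch (the ratios passed in at the end are made-up illustrative values, not published results):

```python
import math

# Overall speed metric: geometric mean of the per-benchmark SPECratios,
# computed via logarithms for numerical robustness.
def overall_metric(ratios):
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Illustrative (invented) SPECratios for a four-benchmark suite.
print(round(overall_metric([10.3, 8.0, 12.5, 9.1]), 2))
```

Unlike the arithmetic mean, the geometric mean treats a 2x gain on any benchmark equally, which matches the "I don't know my workload mix" question above.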
Throughput vs. Speed
The big flame does ~250 liters/hour; each little flame does only 4 * 40 = 160 liters/hour
Alternatives:
If I only need to heat up an UNOPENED container holding 1 gallon of soup, supper can be served most quickly if I put it on the big flame
If I need to heat up one butt of soup (= 2 hogsheads), and if I can open the container, I'd be better off using many small flames
In the IT business: processing one image in Photoshop or Gimp vs. rendering the next movie with thousands of pictures

CPU2006 Throughput Metric
Formula: the number of copies run * reference time for the benchmark / elapsed time in seconds
Example: a Sun Fire E25K runs 144 copies of 400.perlbench in 1066 seconds: 144 * 9770 / 1066 ≈ 1320
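The rate formula above, applied to the Sun Fire E25K example (the function name is my own):

```python
# CPU2006 throughput (rate) metric:
#   copies run * reference time / elapsed time in seconds.
def spec_rate(copies: int, ref_seconds: float, elapsed_seconds: float) -> float:
    return copies * ref_seconds / elapsed_seconds

# 144 copies of 400.perlbench (reference time 9770 s) in 1066 s.
print(round(spec_rate(144, 9770, 1066)))   # 1320
```

Note how the same reference time anchors both metrics: speed divides it by one copy's elapsed time, rate multiplies it by how many copies finished per unit of elapsed time.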
Summary of Metrics
Two different kinds of metrics:
speed (single-application turnaround)
rate (throughput)
Run rules make the difference between base and peak:
Base: conservative optimization, less freedom
Peak: more aggressive optimization, more freedom
Two benchmark sets: SPECint and SPECfp
2^3 = 8 different metrics
If you look at the single-application results you get 2*2*(12+17) = 116 different metrics

Example for Run Rules
Base does not allow feedback-directed optimization (still legal in peak)
An unlimited number of flags may be set in base. Why? Because flag counting is not worth arguing about. For example, is -fast:np27 one flag, two, or three? Prove it. What if it's -fast_np27? What if it's -fast np27 or fast np27?
SPEC CPU2000 Result

Thank You!
More informationOverlapping Data Transfer With Application Execution on Clusters
Overlapping Data Transfer With Application Execution on Clusters Karen L. Reid and Michael Stumm reid@cs.toronto.edu stumm@eecg.toronto.edu Department of Computer Science Department of Electrical and Computer
More informationHPC Deployment of OpenFOAM in an Industrial Setting
HPC Deployment of OpenFOAM in an Industrial Setting Hrvoje Jasak h.jasak@wikki.co.uk Wikki Ltd, United Kingdom PRACE Seminar: Industrial Usage of HPC Stockholm, Sweden, 28-29 March 2011 HPC Deployment
More informationPerformance of HPC Applications on the Amazon Web Services Cloud
Cloudcom 2010 November 1, 2010 Indianapolis, IN Performance of HPC Applications on the Amazon Web Services Cloud Keith R. Jackson, Lavanya Ramakrishnan, Krishna Muriki, Shane Canon, Shreyas Cholia, Harvey
More informationParallelism and Cloud Computing
Parallelism and Cloud Computing Kai Shen Parallel Computing Parallel computing: Process sub tasks simultaneously so that work can be completed faster. For instances: divide the work of matrix multiplication
More informationSoftware Development around a Millisecond
Introduction Software Development around a Millisecond Geoffrey Fox In this column we consider software development methodologies with some emphasis on those relevant for large scale scientific computing.
More informationGraphic Chartiles and High Performance Computing
Center for Information Services and High Performance Computing (ZIH) Leistungsanalyse von Rechnersystemen Data Presentation Nöthnitzer Straße 46 Raum 1026 Tel. +49 351-463 - 35048 Holger Brunst (holger.brunst@tu-dresden.de)
More informationPARALLEL & CLUSTER COMPUTING CS 6260 PROFESSOR: ELISE DE DONCKER BY: LINA HUSSEIN
1 PARALLEL & CLUSTER COMPUTING CS 6260 PROFESSOR: ELISE DE DONCKER BY: LINA HUSSEIN Introduction What is cluster computing? Classification of Cluster Computing Technologies: Beowulf cluster Construction
More informationSeagate HPC /Big Data Business Tech Talk. December 2014
Seagate HPC /Big Data Business Tech Talk December 2014 Safe Harbor Statement This document contains forward-looking statements within the meaning of Section 27A of the Securities Act of 1933, and Section
More informationThe Green Index: A Metric for Evaluating System-Wide Energy Efficiency in HPC Systems
202 IEEE 202 26th IEEE International 26th International Parallel Parallel and Distributed and Distributed Processing Processing Symposium Symposium Workshops Workshops & PhD Forum The Green Index: A Metric
More informationIntroduction to GPU Programming Languages
CSC 391/691: GPU Programming Fall 2011 Introduction to GPU Programming Languages Copyright 2011 Samuel S. Cho http://www.umiacs.umd.edu/ research/gpu/facilities.html Maryland CPU/GPU Cluster Infrastructure
More informationEEM 486: Computer Architecture. Lecture 4. Performance
EEM 486: Computer Architecture Lecture 4 Performance EEM 486 Performance Purchasing perspective Given a collection of machines, which has the» Best performance?» Least cost?» Best performance / cost? Design
More informationCloud computing. Intelligent Services for Energy-Efficient Design and Life Cycle Simulation. as used by the ISES project
Intelligent Services for Energy-Efficient Design and Life Cycle Simulation Project number: 288819 Call identifier: FP7-ICT-2011-7 Project coordinator: Technische Universität Dresden, Germany Website: ises.eu-project.info
More informationComparison of Windows IaaS Environments
Comparison of Windows IaaS Environments Comparison of Amazon Web Services, Expedient, Microsoft, and Rackspace Public Clouds January 5, 215 TABLE OF CONTENTS Executive Summary 2 vcpu Performance Summary
More informationDELL VS. SUN SERVERS: R910 PERFORMANCE COMPARISON SPECint_rate_base2006
DELL VS. SUN SERVERS: R910 PERFORMANCE COMPARISON OUR FINDINGS The latest, most powerful Dell PowerEdge servers deliver better performance than Sun SPARC Enterprise servers. In Principled Technologies
More informationSUBJECT: SOLIDWORKS HARDWARE RECOMMENDATIONS - 2013 UPDATE
SUBJECT: SOLIDWORKS RECOMMENDATIONS - 2013 UPDATE KEYWORDS:, CORE, PROCESSOR, GRAPHICS, DRIVER, RAM, STORAGE SOLIDWORKS RECOMMENDATIONS - 2013 UPDATE Below is a summary of key components of an ideal SolidWorks
More informationUnit 4: Performance & Benchmarking. Performance Metrics. This Unit. CIS 501: Computer Architecture. Performance: Latency vs.
This Unit CIS 501: Computer Architecture Unit 4: Performance & Benchmarking Metrics Latency and throughput Speedup Averaging CPU Performance Performance Pitfalls Slides'developed'by'Milo'Mar0n'&'Amir'Roth'at'the'University'of'Pennsylvania'
More informationWhen Prefetching Works, When It Doesn t, and Why
When Prefetching Works, When It Doesn t, and Why JAEKYU LEE, HYESOON KIM, and RICHARD VUDUC, Georgia Institute of Technology In emerging and future high-end processor systems, tolerating increasing cache
More informationHigh Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/2015. 2015 CAE Associates
High Performance Computing (HPC) CAEA elearning Series Jonathan G. Dudley, Ph.D. 06/09/2015 2015 CAE Associates Agenda Introduction HPC Background Why HPC SMP vs. DMP Licensing HPC Terminology Types of
More informationOracle Database Scalability in VMware ESX VMware ESX 3.5
Performance Study Oracle Database Scalability in VMware ESX VMware ESX 3.5 Database applications running on individual physical servers represent a large consolidation opportunity. However enterprises
More informationInterconnect Analysis: 10GigE and InfiniBand in High Performance Computing
Interconnect Analysis: 10GigE and InfiniBand in High Performance Computing WHITE PAPER Highlights: There is a large number of HPC applications that need the lowest possible latency for best performance
More informationMulticore Parallel Computing with OpenMP
Multicore Parallel Computing with OpenMP Tan Chee Chiang (SVU/Academic Computing, Computer Centre) 1. OpenMP Programming The death of OpenMP was anticipated when cluster systems rapidly replaced large
More informationHPC Wales Skills Academy Course Catalogue 2015
HPC Wales Skills Academy Course Catalogue 2015 Overview The HPC Wales Skills Academy provides a variety of courses and workshops aimed at building skills in High Performance Computing (HPC). Our courses
More informationMEng, BSc Computer Science with Artificial Intelligence
School of Computing FACULTY OF ENGINEERING MEng, BSc Computer Science with Artificial Intelligence Year 1 COMP1212 Computer Processor Effective programming depends on understanding not only how to give
More information64-Bit versus 32-Bit CPUs in Scientific Computing
64-Bit versus 32-Bit CPUs in Scientific Computing Axel Kohlmeyer Lehrstuhl für Theoretische Chemie Ruhr-Universität Bochum March 2004 1/25 Outline 64-Bit and 32-Bit CPU Examples
More informationBLM 413E - Parallel Programming Lecture 3
BLM 413E - Parallel Programming Lecture 3 FSMVU Bilgisayar Mühendisliği Öğr. Gör. Musa AYDIN 14.10.2015 2015-2016 M.A. 1 Parallel Programming Models Parallel Programming Models Overview There are several
More informationIntroduction to GPU hardware and to CUDA
Introduction to GPU hardware and to CUDA Philip Blakely Laboratory for Scientific Computing, University of Cambridge Philip Blakely (LSC) GPU introduction 1 / 37 Course outline Introduction to GPU hardware
More informationRED HAT ENTERPRISE VIRTUALIZATION PERFORMANCE: SPECVIRT BENCHMARK
RED HAT ENTERPRISE VIRTUALIZATION PERFORMANCE: SPECVIRT BENCHMARK AT A GLANCE The performance of Red Hat Enterprise Virtualization can be compared to other virtualization platforms using the SPECvirt_sc2010
More informationBenchmarks and Performance Tests
Chapter 7 Benchmarks and Performance Tests 7.1 Introduction It s common sense everyone agrees that the best way to study the performance of a given system is to run the actual workload on the hardware
More informationComputing Performance Benchmarks among CPU, GPU, and FPGA
Computing Performance Benchmarks among CPU, GPU, and FPGA MathWorks Authors: Christopher Cullinan Christopher Wyant Timothy Frattesi Advisor: Xinming Huang Abstract In recent years, the world of high performance
More informationMeasuring Computer Systems: How to Measure Performance
: How to Measure Performance V E R I T A S Margo Seltzer, Aaron Brown Harvard University Division of Engineering and Applied Sciences {margo, abrown}@eecs.harvard.edu Abstract Benchmarks shape a field
More informationCollaborative and Interactive CFD Simulation using High Performance Computers
Collaborative and Interactive CFD Simulation using High Performance Computers Petra Wenisch, Andre Borrmann, Ernst Rank, Christoph van Treeck Technische Universität München {wenisch, borrmann, rank, treeck}@bv.tum.de
More information18-742 Lecture 4. Parallel Programming II. Homework & Reading. Page 1. Projects handout On Friday Form teams, groups of two
age 1 18-742 Lecture 4 arallel rogramming II Spring 2005 rof. Babak Falsafi http://www.ece.cmu.edu/~ece742 write X Memory send X Memory read X Memory Slides developed in part by rofs. Adve, Falsafi, Hill,
More informationGPU Hardware and Programming Models. Jeremy Appleyard, September 2015
GPU Hardware and Programming Models Jeremy Appleyard, September 2015 A brief history of GPUs In this talk Hardware Overview Programming Models Ask questions at any point! 2 A Brief History of GPUs 3 Once
More informationTrends in High-Performance Computing for Power Grid Applications
Trends in High-Performance Computing for Power Grid Applications Franz Franchetti ECE, Carnegie Mellon University www.spiral.net Co-Founder, SpiralGen www.spiralgen.com This talk presents my personal views
More informationNext Generation GPU Architecture Code-named Fermi
Next Generation GPU Architecture Code-named Fermi The Soul of a Supercomputer in the Body of a GPU Why is NVIDIA at Super Computing? Graphics is a throughput problem paint every pixel within frame time
More informationBenchmarking the Amazon Elastic Compute Cloud (EC2)
Benchmarking the Amazon Elastic Compute Cloud (EC2) A Major Qualifying Project Report submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUTE in partial fulfillment of the requirements for the
More informationbenchmarking Amazon EC2 for high-performance scientific computing
Edward Walker benchmarking Amazon EC2 for high-performance scientific computing Edward Walker is a Research Scientist with the Texas Advanced Computing Center at the University of Texas at Austin. He received
More informationBenchmarking for High Performance Systems and Applications. Erich Strohmaier NERSC/LBNL Estrohmaier@lbl.gov
Benchmarking for High Performance Systems and Applications Erich Strohmaier NERSC/LBNL Estrohmaier@lbl.gov HPC Reference Benchmarks To evaluate performance we need a frame of reference in the performance
More informationMulti-Threading Performance on Commodity Multi-Core Processors
Multi-Threading Performance on Commodity Multi-Core Processors Jie Chen and William Watson III Scientific Computing Group Jefferson Lab 12000 Jefferson Ave. Newport News, VA 23606 Organization Introduction
More informationPower Efficiency Metrics for the Top500. Shoaib Kamil and John Shalf CRD/NERSC Lawrence Berkeley National Lab
Power Efficiency Metrics for the Top500 Shoaib Kamil and John Shalf CRD/NERSC Lawrence Berkeley National Lab Power for Single Processors HPC Concurrency on the Rise Total # of Processors in Top15 350000
More informationSGI. High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE and SGI UV Systems. January, 2012. Abstract. Haruna Cofer*, PhD
White Paper SGI High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE and SGI UV Systems Haruna Cofer*, PhD January, 2012 Abstract The SGI High Throughput Computing (HTC) Wrapper
More informationHigh Performance Computing in CST STUDIO SUITE
High Performance Computing in CST STUDIO SUITE Felix Wolfheimer GPU Computing Performance Speedup 18 16 14 12 10 8 6 4 2 0 Promo offer for EUC participants: 25% discount for K40 cards Speedup of Solver
More informationQuiz for Chapter 1 Computer Abstractions and Technology 3.10
Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [15 points] Consider two different implementations,
More informationOracle Database Reliability, Performance and scalability on Intel Xeon platforms Mitch Shults, Intel Corporation October 2011
Oracle Database Reliability, Performance and scalability on Intel platforms Mitch Shults, Intel Corporation October 2011 1 Intel Processor E7-8800/4800/2800 Product Families Up to 10 s and 20 Threads 30MB
More informationCluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer
Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer Stan Posey, MSc and Bill Loewe, PhD Panasas Inc., Fremont, CA, USA Paul Calleja, PhD University of Cambridge,
More informationPerformance Tuning of a CFD Code on the Earth Simulator
Applications on HPC Special Issue on High Performance Computing Performance Tuning of a CFD Code on the Earth Simulator By Ken ichi ITAKURA,* Atsuya UNO,* Mitsuo YOKOKAWA, Minoru SAITO, Takashi ISHIHARA
More informationTableau Server Scalability Explained
Tableau Server Scalability Explained Author: Neelesh Kamkolkar Tableau Software July 2013 p2 Executive Summary In March 2013, we ran scalability tests to understand the scalability of Tableau 8.0. We wanted
More informationWhite Paper. Recording Server Virtualization
White Paper Recording Server Virtualization Prepared by: Mike Sherwood, Senior Solutions Engineer Milestone Systems 23 March 2011 Table of Contents Introduction... 3 Target audience and white paper purpose...
More informationHigh Performance Computing. Course Notes 2007-2008. HPC Fundamentals
High Performance Computing Course Notes 2007-2008 2008 HPC Fundamentals Introduction What is High Performance Computing (HPC)? Difficult to define - it s a moving target. Later 1980s, a supercomputer performs
More informationLS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance
11 th International LS-DYNA Users Conference Session # LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance Gilad Shainer 1, Tong Liu 2, Jeff Layton 3, Onur Celebioglu
More informationPerformance analysis of parallel applications on modern multithreaded processor architectures
Available online at www.prace-ri.eu Partnership for Advanced Computing in Europe Performance analysis of parallel applications on modern multithreaded processor architectures Maciej Cytowski* a, Maciej
More informationCloud Performance Benchmark Series
Cloud Performance Benchmark Series Amazon EC2 CPU Speed Benchmarks Kalpit Sarda Sumit Sanghrajka Radu Sion ver..7 C l o u d B e n c h m a r k s : C o m p u t i n g o n A m a z o n E C 2 2 1. Overview We
More informationBENCHMARKING AND CAPACITY PLANNING
CHAPTER 4 BENCHMARKING AND CAPACITY PLANNING This chapter deals with benchmarking and capacity planning of performance evaluation for computer and telecommunication systems. We will address types of benchmark
More informationHP Z Turbo Drive PCIe SSD
Performance Evaluation of HP Z Turbo Drive PCIe SSD Powered by Samsung XP941 technology Evaluation Conducted Independently by: Hamid Taghavi Senior Technical Consultant June 2014 Sponsored by: P a g e
More informationBenchmarking Large Scale Cloud Computing in Asia Pacific
2013 19th IEEE International Conference on Parallel and Distributed Systems ing Large Scale Cloud Computing in Asia Pacific Amalina Mohamad Sabri 1, Suresh Reuben Balakrishnan 1, Sun Veer Moolye 1, Chung
More informationAdvanced discretisation techniques (a collection of first and second order schemes); Innovative algorithms and robust solvers for fast convergence.
New generation CFD Software APUS-CFD APUS-CFD is a fully interactive Arbitrary Polyhedral Unstructured Solver. APUS-CFD is a new generation of CFD software for modelling fluid flow and heat transfer in
More information